I don't know what dx run swiss-army-knife does, but you're passing -icmd="${my_command}", where ${my_command} expands to a string that embeds the already expanded value of ${vcf_file_dir}. That value contains spaces, (VCFs) and more.
From what you're trying to pass after -icmd=, I deduce it's supposed to be shell code, like something sh -c would accept. If so, you need to build it with the necessary quotes embedded in the code.
Specifically, what you're trying to pass is:
bcftools view -h mnt/project/Bulk/DRAGEN WGS/Whole genome variant call files (VCFs) (DRAGEN) [500k release]/10//1002793_file.vcf > header.txt
There is unquoted whitespace; (VCFs) and (DRAGEN) look like subshells in unexpected places (this is where "unexpected token (" comes from); and [500k release] is a filename generation pattern. Proper quoting will fix all of this. Bash can quote for you.
${parameter@operator}
The expansion is either a transformation of the value of parameter or information about parameter itself, depending on the value of operator. Each operator is a single letter:
[…]
Q
The expansion is a string that is the value of parameter quoted in a format that can be reused as input.
(source)
"Can be reused as input": I think this is exactly what you're trying to do. So where you build my_command, it should be:
my_command="bcftools view -h ${vcf_file_dir@Q}/${sample_id@Q}_file.vcf > header.txt"
Now if you do printf '%s\n' "$my_command" then you will see it is properly quoted shell code.
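As a rough check of the above, here is a minimal sketch that builds the command both ways and asks Bash to parse each one without running it. The variable values are assumptions modelled on the path shown earlier, and bash -n -c is used only as a syntax check; it is not something dx itself does. It needs Bash 4.4 or newer.

```bash
#!/usr/bin/env bash
# Assumed values, modelled on the path from the question.
vcf_file_dir='mnt/project/Bulk/DRAGEN WGS/Whole genome variant call files (VCFs) (DRAGEN) [500k release]/10/'
sample_id='1002793'

# The original, unquoted way of building the command.
unquoted="bcftools view -h ${vcf_file_dir}/${sample_id}_file.vcf > header.txt"
# The same command with Bash doing the quoting (@Q, Bash 4.4+).
quoted="bcftools view -h ${vcf_file_dir@Q}/${sample_id@Q}_file.vcf > header.txt"

printf '%s\n' "$quoted"

# bash -n only parses; nothing is executed and header.txt is not created.
if bash -n -c "$unquoted" 2>/dev/null; then unquoted_ok=yes; else unquoted_ok=no; fi
if bash -n -c "$quoted" 2>/dev/null; then quoted_ok=yes; else quoted_ok=no; fi
printf 'unquoted parses: %s\nquoted parses: %s\n' "$unquoted_ok" "$quoted_ok"
```

The unquoted string should be rejected by the parser because of the bare parentheses, while the @Q version should pass.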
@Q is a feature introduced in bash-4.4-alpha. Older versions of Bash can achieve similar results by using the %q format of printf:
my_command="bcftools view -h $(printf %q "$vcf_file_dir")/$(printf %q "$sample_id")_file.vcf > header.txt"
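To see what %q does in isolation, the sketch below escapes one awkward string and then feeds the result back to the shell. The sample value is an assumption taken from the directory name above; the eval is safe here only because its input has just been escaped by printf %q.

```bash
#!/usr/bin/env bash
# Assumed sample value with spaces, parentheses and brackets.
value='Whole genome variant call files (VCFs) (DRAGEN) [500k release]'

# %q produces a form the shell can read back as a single word.
escaped=$(printf %q "$value")
printf '%s\n' "$escaped"

# Round trip: parsing the escaped form recovers the original string.
eval "roundtrip=$escaped"
if [ "$roundtrip" = "$value" ]; then
    echo "round trip OK"
fi
```

The escaped form will look different from what @Q prints (typically backslashes rather than single quotes), but both are meant to parse back to the same value.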
Maybe some very old versions of Bash contain a printf builtin that does not recognize %q, I don't know. If this is a problem, try an external printf (like /usr/bin/printf). Only if this doesn't work, add single-quotes by hand. But note that such a naive fix will be safe only if the values of the variables are fully under your control and they don't contain single-quotes:
# FLAWED in general
my_command="bcftools view -h '${vcf_file_dir}/${sample_id}_file.vcf' > header.txt"
It seems the variables in question fulfill these requirements. However, in general if a rogue user or program could affect any of the variables, they could inject arbitrary shell code (e.g. see what would be the content of $my_command if the content of $vcf_file_dir was literally '& rm -f important_file& '). This is a flaw of adding single-quotes by hand. The method with @Q and the method with printf %q should be safe even in this case.
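The difference is easiest to see by printing both strings for the hostile value mentioned above. This is only a sketch: the command strings are built and printed, never executed, and the sample_id value is an assumption. It needs Bash 4.4 or newer for @Q.

```bash
#!/usr/bin/env bash
# The hostile value from the text: it starts and ends with a single quote.
vcf_file_dir="'& rm -f important_file& '"
sample_id='1002793'   # assumed value

# Hand-added single quotes: the embedded quotes close and reopen the string.
naive="bcftools view -h '${vcf_file_dir}/${sample_id}_file.vcf' > header.txt"
# Quoting done by Bash: the embedded quotes are escaped.
safe="bcftools view -h ${vcf_file_dir@Q}/${sample_id@Q}_file.vcf > header.txt"

printf 'naive: %s\n' "$naive"
printf 'safe:  %s\n' "$safe"

# Round trip: the @Q form parses back to the hostile string as plain data.
eval "recovered=${vcf_file_dir@Q}"
```

In the naive string, rm -f important_file sits outside any quotes, between two & operators, so a shell would run it as a separate command. In the safe string the same text stays inside one quoted word.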
Side note: your code contains a shebang, but in a comment you mentioned that you run the code with bash code.sh. Be aware that the shebang does not matter when you run a script like this; Bash simply treats that line as a comment. The shebang would matter if you ran ./code.sh or similar.
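A small experiment can make the shebang point concrete. The sketch below writes a throwaway script whose shebang deliberately names /bin/echo instead of a shell, then runs it both ways. The file name, the marker text and the use of /bin/echo are all assumptions for illustration, and it presumes the temporary directory allows executing files.

```bash
#!/usr/bin/env bash
# Throwaway script with a deliberately odd shebang (assumes /bin/echo exists).
dir=$(mktemp -d)
script="$dir/code.sh"
printf '%s\n' '#!/bin/echo shebang-was-used:' 'echo "body ran under bash"' > "$script"
chmod +x "$script"

# Run as an argument to bash: the shebang line is just a comment.
via_bash=$(bash "$script")
# Run directly: the kernel hands the file to the shebang interpreter.
via_exec=$("$script")

printf '%s\n' "$via_bash" "$via_exec"
rm -rf "$dir"
```

The first line of output should come from the script body, the second from /bin/echo printing its arguments, which shows that only direct execution consults the shebang.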