Running a command that takes command-line arguments on SLURM

Question

I'm completely new to using HPCs and SLURM, so I'd really appreciate some guidance here.

I need to iteratively run a command that looks like this:

kallisto quant -i '/home/myName/genomes/hSapien.idx' \
               -o "output-SRR3225412"                 \
                         "SRR3225412_1.fastq.gz"       \
                         "SRR3225412_2.fastq.gz"

where the SRR3225412 part will be different in each iteration.

The problem is, as I found out, that I can't just append this to the end of an sbatch command:

sbatch --nodes=1          \
       --ntasks-per-node=1 \
       --cpus-per-task=1    \
         kallisto quant -i '/home/myName/genomes/hSapien.idx' \
                        -o "output-SRR3225412"                 \
                                  "SRR3225412_1.fastq.gz"       \
                                  "SRR3225412_2.fastq.gz"

This command doesn't work. I get the error:

sbatch: error: This does not look like a batch script.  The first
sbatch: error: line must start with #! followed by the path to an interpreter.
sbatch: error: For instance: #!/bin/sh

I wanted to ask: how do I run the sbatch command, specifying its run parameters, while also passing the command-line arguments for the kallisto program I'm trying to use? In the end I'd like to have something like:

#!/bin/bash

for sample in ...
do
    sbatch --nodes=1          \
           --ntasks-per-node=1 \
           --cpus-per-task=1    \
             kallistoCommandOnSample --arg1 a1 \
                                     --arg2 a2 arg3 a3
done

Answer 1

The error "sbatch: error: This does not look like a batch script." occurs because sbatch expects a submission script. That is a batch script, typically a Bash script, in which comments starting with #SBATCH are interpreted by Slurm as options.

So the typical way of submitting a job is to create a file, let's name it submit.sh:

#! /bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1

kallisto quant -i '/home/myName/genomes/hSapien.idx' \
               -o "output-SRR3225412"                 \
                         "SRR3225412_1.fastq.gz"       \
                         "SRR3225412_2.fastq.gz"

and then submit it with

sbatch submit.sh
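
If a loop over samples is still preferred, note that sbatch forwards any arguments placed after the script name to the script itself. The sketch below assumes submit.sh has been edited to use "$1" wherever SRR3225412 is hard-coded above; the second sample ID and the DRY_RUN switch are hypothetical, added only so the commands can be inspected before anything is submitted.

```shell
#!/bin/bash
# Sketch: one job per sample, passing the sample ID as an argument to submit.sh.
# Assumes submit.sh reads the ID from "$1" instead of hard-coding SRR3225412.

# submit ARGS...: print the sbatch command (DRY_RUN=1, the default here)
# or actually run it (DRY_RUN=0).
submit() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo sbatch "$@"
    else
        sbatch "$@"
    fi
}

# SRR3225413 is a made-up ID for illustration; only SRR3225412 is from the question.
for sample in SRR3225412 SRR3225413; do
    submit submit.sh "$sample"
done
```

Running it with DRY_RUN=0 in the environment would submit the jobs for real.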

If you have multiple similar jobs to submit, it is beneficial for several reasons to use a job array. The loop you want to create can be replaced with a single submission script looking like

#! /bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1
#SBATCH --array=0-9 # Replace 9 with the number of samples minus one (Bash arrays are 0-indexed)

SAMPLES=(...) # here put what you would loop over
CURRSAMPLE=${SAMPLES[$SLURM_ARRAY_TASK_ID]}
kallisto quant -i '/home/myName/genomes/hSapien.idx' \
               -o "output-${CURRSAMPLE}"              \
                         "${CURRSAMPLE}_1.fastq.gz"    \
                         "${CURRSAMPLE}_2.fastq.gz"

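As a possible variant of the array script, the sample IDs can live in a text file with one ID per line, and each array task picks the line matching its task ID. This is only a sketch: samples.txt and the sample_for_task helper are hypothetical names, and because line numbers start at 1 this variant pairs with a 1-based range such as --array=1-10.

```shell
#!/bin/bash
# Sketch: look up the sample for this array task in a file of sample IDs.
# samples.txt is a hypothetical file; line N belongs to array task N.

# sample_for_task FILE TASK_ID: print line TASK_ID of FILE.
sample_for_task() {
    sed -n "${2}p" "$1"
}

# Slurm sets SLURM_ARRAY_TASK_ID inside an array job; default to 1 so the
# lookup can also be tried outside Slurm.
if [ -f samples.txt ]; then
    CURRSAMPLE=$(sample_for_task samples.txt "${SLURM_ARRAY_TASK_ID:-1}")
    echo "task ${SLURM_ARRAY_TASK_ID:-1} -> ${CURRSAMPLE}"
fi
```

The kallisto command from the script above would then follow unchanged, using ${CURRSAMPLE}.
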
As pointed out by @Carles Fenoy, if you do not want to use a submission script, you can use the --wrap parameter of sbatch:

sbatch --nodes=1          \
       --ntasks-per-node=1 \
       --cpus-per-task=1    \
       --wrap "kallisto quant -i '/home/myName/genomes/hSapien.idx' \
                              -o 'output-SRR3225412'                 \
                                        'SRR3225412_1.fastq.gz'       \
                                        'SRR3225412_2.fastq.gz'"
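
The --wrap form also fits the loop sketched in the question. The following is a minimal sketch, not a definitive script: build_cmd is a hypothetical helper, SRR3225413 is a made-up second sample, and the DRY_RUN switch is added so the generated commands can be checked first. It assumes sample IDs contain no quote characters.

```shell
#!/bin/bash
# Sketch: submit one --wrap job per sample from a loop.

# build_cmd SAMPLE: print the kallisto command line for one sample.
build_cmd() {
    printf "kallisto quant -i '/home/myName/genomes/hSapien.idx' -o 'output-%s' '%s_1.fastq.gz' '%s_2.fastq.gz'" "$1" "$1" "$1"
}

# SRR3225413 is a made-up ID for illustration.
for sample in SRR3225412 SRR3225413; do
    cmd=$(build_cmd "$sample")
    if [ "${DRY_RUN:-1}" = 1 ]; then
        # Default: only show what would be submitted.
        printf 'sbatch --nodes=1 --ntasks-per-node=1 --cpus-per-task=1 --wrap "%s"\n' "$cmd"
    else
        sbatch --nodes=1 --ntasks-per-node=1 --cpus-per-task=1 --wrap "$cmd"
    fi
done
```

Compared with a job array, this creates one independent job per sample, which is simple but gives the scheduler many separate submissions to track.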

Answer 1: 2020-10-30 07:40:54
