
How to immediately submit all Snakemake jobs to slurm cluster

I'm using Snakemake to build a variant calling pipeline that runs on a SLURM cluster. The cluster has login nodes and compute nodes; any real computing should be done on the compute nodes as an srun or sbatch job, and jobs are limited to 48 hours of runtime. My problem is that processing many samples, especially when the queue is busy, can take more than 48 hours in total. Snakemake's traditional cluster execution leaves a master process running that only submits a rule's job to the queue after all of that rule's dependencies have finished. I'm supposed to run this master process on a compute node, so the 48-hour limit applies to my entire pipeline.

I know SLURM jobs have dependency directives that tell a job to wait until other jobs have finished. Because the Snakemake workflow is a DAG, is it possible to submit all the jobs at once, with each job's dependencies taken from the DAG? After all the jobs were submitted, the master process would exit, circumventing the 48-hour limit. Is this possible with Snakemake, and if so, how does it work? I've found the --immediate-submit command line option, but I'm not sure whether it has the behavior I'm looking for, or how to use it, because my cluster prints Submitted batch job [id] after a job is submitted to the queue instead of just the job id.
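For context, this is how the dependency directives mentioned above work in plain SLURM, without Snakemake (align.sh and call_variants.sh are hypothetical scripts used only for illustration):

# submit a parent job; --parsable makes sbatch print only the job id
parent=$(sbatch --parsable align.sh)
# the child job starts only after the parent has finished successfully
sbatch --dependency=afterok:$parent call_variants.sh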

Immediate submit unfortunately does not work out of the box, but needs some tuning, because the way dependencies between jobs are passed along differs between cluster systems. A while ago I struggled with the same problem. As the immediate-submit docs say:

Immediately submit all jobs to the cluster instead of waiting for present input files. This will fail, unless you make the cluster aware of job dependencies, eg via: $ snakemake --cluster 'sbatch --dependency {dependencies}'. Assuming that your submit script (here sbatch) outputs the generated job id to the first stdout line, {dependencies} will be filled with space separated job ids this job depends on.
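For illustration, this is what a bare sbatch call prints on this kind of cluster (the id 123456 is made up):

$ sbatch job.sh
Submitted batch job 123456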

So the problem is that sbatch does not output just the generated job id as the first stdout line. However, we can circumvent this with our own shell script:

parseJobID.sh:

#!/bin/bash
# Helper script that parses SLURM's "Submitted batch job <id>" output for the
# job IDs and feeds them back to sbatch as a --dependency flag.
# This is required when you want to use the snakemake --immediate-submit option.

if [[ "Submitted batch job" =~ "$@" ]]; then
  # "$@" is empty (no dependencies): a quoted =~ is a literal substring
  # match, and the empty string always matches, so emit nothing.
  echo -n ""
else
  # Pull every job id out of "$@" and join them with commas, e.g. "123,456".
  deplist=$(grep -Eo '[0-9]{1,10}' <<< "$@" | tr '\n' ',' | sed 's/.$//')
  echo -n "--dependency=aftercorr:$deplist"
fi

And make sure to give the script execute permission with chmod +x parseJobID.sh.
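You can try the script by hand to see what it produces; Snakemake will pass the raw submission output of the parent jobs as its arguments (the ids here are made up):

$ ./parseJobID.sh "Submitted batch job 123" "Submitted batch job 456"
--dependency=aftercorr:123,456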

We can then call immediate submit like this:

snakemake --cluster 'sbatch $(./parseJobID.sh {dependencies})' --jobs 100 --notemp --immediate-submit

Note that this will submit at most 100 jobs at the same time. You can increase or decrease this to any number you like, but be aware that most cluster systems do not allow more than 1000 queued jobs per user at the same time. The --notemp flag is there because, with every job submitted up front, Snakemake is no longer around to delete temp() files at the right moment, so temp handling has to be disabled.
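After submission, the master process exits and you can check from the login node that the dependencies were wired up correctly (%E is, to my knowledge, squeue's format field for a job's remaining dependencies):

# list your queued jobs with id, name, state, and remaining dependencies
squeue -u $USER -o "%.10i %.20j %.8T %E"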
