Slurm - How to create a job array with jobs that are not in the same folder

Question

I have a folder structure like this:

/home/01/01/script.R
/home/01/02/script.R
/home/01/03/script.R
/home/02/01/script.R
/home/02/02/script.R
/home/02/03/script.R
/home/03/01/script.R
/home/03/02/script.R
/home/03/03/script.R

I want to submit all of these scripts to Slurm together as one job array. However, I am running into problems because they are not in the same folder. What I currently know how to do is submit these scripts as three separate job arrays: one at /home/01, a second at /home/02, and a third at /home/03. Is there an easy way to submit all nine jobs together as part of a single array WITHOUT putting them all in the same folder? The folder structure needs to stay strictly as it is here.

This is the script that I am currently using, which doesn't work:

#!/bin/bash
# submit_array.sh

#SBATCH --job-name=array_test
#SBATCH --mail-user=user@test.com
#SBATCH --mail-type=end
#SBATCH --ntasks=1
#SBATCH --nodes=1
#SBATCH --mem=50                      
#SBATCH --time=0-00:01:00               
#SBATCH --qos=standard

declare -a combinations
index=0
for dataset in `seq -w 01 03`
do
    for chain in `seq -w 01 03`
    do
        combinations[$index]="$dataset $chain"
        index=$((index + 1))

    done
done

parameters=(${combinations[${SLURM_ARRAY_TASK_ID}]})

dataset=${parameters[0]}
chain=${parameters[1]}

module add R

cd /home/$dataset/$chain
R CMD BATCH script.R

Any help would be appreciated, thanks!

Answer 1

One method is to encode the folder combinations as separate IDs in the sbatch array. The associated ${SLURM_ARRAY_TASK_ID} can then be parsed with substring parameter expansion in the shell script, as follows:

sbatch -a 101,102,103,201,202,203,301,302,303 ./submit_array.sh

where the contents of submit_array.sh are:

#!/bin/bash
# submit_array.sh

#SBATCH --job-name=array_test
#SBATCH --mail-user=user@test.com
#SBATCH --mail-type=end
#SBATCH --ntasks=1
#SBATCH --nodes=1
#SBATCH --mem=50                      
#SBATCH --time=0-00:01:00               
#SBATCH --qos=standard

# job arrays do not usually support 4 digits,
# so we prepend "0" to the first digit for the dataset variable
dataset="0${SLURM_ARRAY_TASK_ID::1}"
# then chain uses the last two digits (offset 1, length 2)
chain=${SLURM_ARRAY_TASK_ID:1:2}

module add R

cd /home/${dataset}/${chain}
R CMD BATCH script.R
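
To check the ID scheme before submitting anything, the decoding can be dry-run outside Slurm. The following is only a minimal sketch: the helper names build_ids and decode_id are made up for illustration and are not part of Slurm, and it assumes the same 3 x 3 layout of datasets and chains as in the question.

```shell
#!/bin/bash
# Dry-run sketch: build the array ID list and show how each ID decodes.
# build_ids and decode_id are hypothetical helper names, not Slurm commands.

# Print a comma-separated ID list: one dataset digit followed by
# the two-digit chain, e.g. 101,102,...,303
build_ids() {
    local ids=() d c
    for d in 1 2 3; do
        for c in 01 02 03; do
            ids+=("${d}${c}")
        done
    done
    local IFS=,
    echo "${ids[*]}"
}

# Turn one task ID (e.g. 203) into "dataset chain" (e.g. "02 03")
decode_id() {
    local id=$1
    local dataset="0${id:0:1}"   # first digit, with the leading zero restored
    local chain="${id:1:2}"      # last two digits
    echo "$dataset $chain"
}

# Show the directory each array task would cd into
IFS=, read -r -a task_ids <<< "$(build_ids)"
for id in "${task_ids[@]}"; do
    read -r dataset chain <<< "$(decode_id "$id")"
    echo "task $id -> /home/$dataset/$chain"
done
```

With these helpers defined in the current shell, the list could also be passed straight to the submit command, for example sbatch -a "$(build_ids)" ./submit_array.sh, instead of typing the IDs by hand.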

Answer 1 dated 2022-05-07 16:54:42