How to submit jobs to SLURM with different nodes?

I have to run multiple simulations on a cluster using sbatch. In one folder I have the Python script to be run and a file to be used with sbatch:

#!/bin/bash -l
#SBATCH --time=04:00:00
#SBATCH --nodes=32
#SBATCH --ntasks-per-core=1
#SBATCH --ntasks-per-node=36
#SBATCH --cpus-per-task=1
#SBATCH --partition=normal
#SBATCH --constraint=mc

module load Python

source /scratch/.../env/bin/activate

srun python3 script.py

deactivate

What I have to do is run the same Python script with different values for --nodes. How can I do that? Moreover, I would like to create one folder for each run, named something like "nodes_xy", where the Slurm output file will be saved.

Assuming your script is named submit.sh, you can remove the --nodes line from the script and run:

for i in 2 4 8 16 32 64; do sbatch --nodes "$i" --output "nodes_$i.txt" submit.sh; done

This submits submit.sh once for each value (2, 4, 8, and so on) with two additional parameters: --nodes, which controls the number of nodes used, and --output, which sets the name of the output file. Note that all the output files will end up in the current directory; you will need to develop the one-liner a bit if you really need them in separate directories.
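One possible way to develop the one-liner so that each run gets its own directory is sketched below. It assumes submit.sh (without its --nodes line) is in the current directory. The SUBMIT variable and the slurm-%j.out file name are illustrative choices, not part of the original answer. By default the loop only prints the sbatch commands, so it can be tried safely first.

```shell
#!/bin/bash
# Sketch: one directory per node count, with the Slurm output file inside it.
# SUBMIT is a hypothetical helper variable: it defaults to a dry run that
# prints each command; run with SUBMIT=sbatch to actually submit.
SUBMIT="${SUBMIT:-echo sbatch}"

for i in 2 4 8 16 32 64; do
    dir="nodes_$i"
    # Create the directory up front so the output path exists at job start.
    mkdir -p "$dir"
    # %j in the output file name is replaced by sbatch with the job ID.
    $SUBMIT --nodes="$i" --output="$dir/slurm-%j.out" submit.sh
done
```

Once the printed commands look right, something like `SUBMIT=sbatch bash submit_all.sh` (the script name is hypothetical) would submit the jobs for real.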

If the maximum allowed run time permits, you can perform all the runs in a single job with something like this:

#!/bin/bash -l
#SBATCH --time=04:00:00
# The allocation must be at least as large as the largest step in the loop below.
#SBATCH --nodes=64
#SBATCH --ntasks-per-core=1
#SBATCH --ntasks-per-node=36
#SBATCH --cpus-per-task=1
#SBATCH --partition=normal
#SBATCH --constraint=mc

module load Python

source /scratch/.../env/bin/activate

for i in 2 4 8 16 32 64; do
    srun --nodes "$i" python3 script.py > "nodes_$i"
done

deactivate
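
If the per-run folders are also wanted in the single-job variant, the loop in the job script above could be adapted roughly as follows. This is only a sketch: the RUN variable, the output.txt file name, and the fallback value of 64 are assumptions made for illustration. The check against SLURM_JOB_NUM_NODES skips any step that asks for more nodes than the job was allocated, since srun cannot start such a step.

```shell
# Sketch of a replacement for the loop in the job script.
# RUN is a hypothetical helper variable: it defaults to a dry run that prints
# each command; set RUN=srun inside the real job script.
RUN="${RUN:-echo srun}"
# SLURM_JOB_NUM_NODES is set by Slurm inside a job; the fallback of 64 is an
# assumption used only for dry runs outside an allocation.
allocated="${SLURM_JOB_NUM_NODES:-64}"

for i in 2 4 8 16 32 64; do
    if [ "$i" -gt "$allocated" ]; then
        echo "skipping $i nodes: only $allocated allocated" >&2
        continue
    fi
    # One directory per run, holding that run's standard output.
    mkdir -p "nodes_$i"
    $RUN --nodes "$i" python3 script.py > "nodes_$i/output.txt"
done
```

The runs execute one after another, so the --time limit has to cover the sum of all of them.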
