
SLURM: Embarrassingly parallel program inside an embarrassingly parallel program

I have a complex model written in Matlab. The model was not written by us and is best thought of as a "black box": fixing the relevant problems from the inside would require rewriting the entire model, which would take years.

If I have an "embarrassingly parallel" problem, I can use a job array to submit X variations of the same simulation with the option #SBATCH --array=1-X . However, clusters normally have a (frustratingly small) limit on the maximum array size.
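(For reference, on SLURM this limit is set by the cluster's MaxArraySize configuration parameter; a quick way to inspect it, assuming you can query the configuration:

scontrol show config | grep -i MaxArraySize

The exact value depends on the site.)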

While using a PBS/TORQUE cluster, I got around this problem by forcing Matlab to run on a single thread, requesting multiple CPUs, and then running multiple instances of Matlab in the background. An example submission script is:

#!/bin/bash
<OTHER PBS COMMANDS>
#PBS -l nodes=1:ppn=5,walltime=30:00:00
#PBS -t 1-600

<GATHER DYNAMIC ARGUMENTS FOR MATLAB FUNCTION CALLS BASED ON ARRAY NUMBER>

# define Matlab options
options="-nodesktop -noFigureWindows -nosplash -singleCompThread"

for sub_job in {1..5}
do
    <GATHER DYNAMIC ARGUMENTS FOR MATLAB FUNCTION CALLS BASED ON LOOP NUMBER (i.e. sub_job)>
    matlab ${options} -r "run_model(${arg1}, ${arg2}, ..., ${argN}); exit" &
done
wait
<TIDY UP AND FINISH COMMANDS>

Can anyone help me do the equivalent on a SLURM cluster?

  • The par function will not run my model in a parallel loop in Matlab.
  • The PBS/TORQUE language was very intuitive, but SLURM's is confusing me. Assuming a submission script structured similarly to my PBS example, here is what I think certain commands will result in:
    • --cpus-per-task=5 seems like the most obvious one to me. Would I put srun in front of the matlab command in the loop, or leave it as it is in the PBS script loop?
    • --ntasks=5 would, I imagine, request 5 CPUs, but the work would run in serial unless a program specifically requests them (e.g. MPI or multi-threaded Python). Would I need to put srun in front of the Matlab command in this case?

I am not a big expert on array jobs, but I can help you with the inner loop.

I would always use GNU parallel to run several serial processes in parallel within a single job that has more than one CPU available. It is a simple perl script, so not difficult to 'install', and its syntax is extremely easy. What it basically does is run some (nested) loop in parallel. Each iteration of this loop contains a (long) process, like your Matlab command. In contrast to your solution, it does not start all of these processes at once; it runs only N processes at the same time (where N is the number of CPUs you have available). As soon as one finishes, the next one is started, and so on, until your entire loop is finished. It is perfectly fine that not all processes take the same amount of time: as soon as one CPU is freed, another process is started.
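As a minimal illustration of the syntax (a toy example, not the job script itself), this runs ten short commands but never more than four at once, starting a new one whenever a slot frees up:

# toy example: 10 commands, at most 4 running simultaneously
parallel --max-procs=4 'echo "item {1}"; sleep 1' ::: {1..10}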

Then, what you would like to do is launch 600 jobs (for which I substitute 3 below, to show the complete behavior), each with 5 CPUs. To do that, you could do the following (where I have not included the actual run of matlab, but that can trivially be included, as sketched after the script below):

#!/bin/bash
#SBATCH --job-name example
#SBATCH --out job.slurm.out
#SBATCH --nodes 1
#SBATCH --ntasks 1
#SBATCH --cpus-per-task 5
#SBATCH --mem 512
#SBATCH --time 30:00:00
#SBATCH --array 1-3

cmd="echo matlab array=${SLURM_ARRAY_TASK_ID}"

parallel --max-procs=${SLURM_CPUS_PER_TASK} "$cmd,subjob={1}; sleep 30" ::: {1..5}
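For reference, a sketch of what the real Matlab call could look like in place of the echo placeholder; run_model and its arguments are illustrative, not part of the original answer:

# replace the placeholder with the actual Matlab invocation
options="-nodesktop -noFigureWindows -nosplash -singleCompThread"
parallel --max-procs=${SLURM_CPUS_PER_TASK} \
    "matlab ${options} -r 'run_model(${SLURM_ARRAY_TASK_ID}, {1}); exit'" ::: {1..5}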

Submitting this job using:

$ sbatch job.slurm

submits 3 jobs to the queue. For example:

$ squeue | grep tdegeus
         3395882_1     debug  example  tdegeus  R       0:01      1 c07
         3395882_2     debug  example  tdegeus  R       0:01      1 c07
         3395882_3     debug  example  tdegeus  R       0:01      1 c07

Each job gets 5 CPUs. These are exploited by the parallel command to run your inner loop in parallel. Once again, the range of this inner loop may be (much) larger than 5; parallel takes care of the balancing between the 5 available CPUs within this job.

Let's inspect the output:

$ cat job.slurm.out

matlab array=2,subjob=1
matlab array=2,subjob=2
matlab array=2,subjob=3
matlab array=2,subjob=4
matlab array=2,subjob=5
matlab array=1,subjob=1
matlab array=3,subjob=1
matlab array=1,subjob=2
matlab array=1,subjob=3
matlab array=1,subjob=4
matlab array=3,subjob=2
matlab array=3,subjob=3
matlab array=1,subjob=5
matlab array=3,subjob=4
matlab array=3,subjob=5

You can clearly see that the 3 × 5 processes now run at the same time (as their output is interleaved).

No need in this case to use srun. SLURM will create 3 jobs. Within each job, everything happens on an individual compute node (i.e. as if you were running on your own system).


Installing GNU Parallel - option 1

To 'install' GNU parallel into your home folder, for example in ~/opt:

  1. Download the latest GNU Parallel.

  2. Make the directory ~/opt if it does not yet exist:

     mkdir $HOME/opt
  3. 'Install' GNU Parallel:

     tar jxvf parallel-latest.tar.bz2
     cd parallel-XXXXXXXX
     ./configure --prefix=$HOME/opt
     make
     make install
  4. Add ~/opt to your path:

     export PATH=$HOME/opt/bin:$PATH

    (To make it permanent, add that line to your ~/.bashrc.)
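You can then check that the shell finds the newly installed binary:

which parallel
parallel --version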


Installing GNU Parallel - option 2

Use conda.

  1. (Optional) Create a new environment:

     conda create --name myenv
  2. Activate the environment:

     conda activate myenv
  3. Install GNU parallel:

     conda install -c conda-forge parallel

Note that the command is available only while the environment is activated.

While Tom's suggestion to use GNU Parallel is a good one, I will attempt to answer the question as asked.

If you want to run 5 instances of the matlab command with the same arguments (for example, if they were communicating via MPI), then you would want to ask for --cpus-per-task=1 and --ntasks=5, and you should preface your matlab line with srun and get rid of the loop.
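A minimal sketch of that first case (directives and arguments are illustrative):

#!/bin/bash
#SBATCH --ntasks=5
#SBATCH --cpus-per-task=1

# srun launches one copy of the command per task, i.e. 5 instances here
srun matlab -nodesktop -nosplash -singleCompThread -r "run_model(${arg1}); exit"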

In your case, as each of your 5 calls to matlab is independent, you want to ask for --cpus-per-task=5 and --ntasks=1. This will ensure that you allocate 5 CPU cores per job to do with as you wish. You can preface your matlab line with srun if you wish, but it will make little difference since you are only running one task.

Of course, this is only efficient if each of your 5 matlab runs takes about the same amount of time, since if one takes much longer, the other 4 CPU cores will sit idle, waiting for the fifth to finish.
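Putting this together, a sketch of the questioner's PBS script translated to SLURM under this approach (argument gathering elided, as in the original):

#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=5
#SBATCH --time=30:00:00
#SBATCH --array=1-600

options="-nodesktop -noFigureWindows -nosplash -singleCompThread"

for sub_job in {1..5}
do
    # <GATHER DYNAMIC ARGUMENTS BASED ON ${SLURM_ARRAY_TASK_ID} AND ${sub_job}>
    matlab ${options} -r "run_model(${arg1}, ${arg2}, ..., ${argN}); exit" &
done
wait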

You can do it with Python and subprocess. In the approach I describe below, you just set the number of nodes and tasks and that is it: no need for a job array, no need to match the size of the array to the number of simulations, etc. It will just execute the Python code until it is done; more nodes means faster execution.

Also, it is easier to decide on variables, as everything is prepared in Python (which is easier than bash).

It does assume that the Matlab scripts save their output to file; nothing is returned by this function (though that can be changed).

In the sbatch script you need to add something like this:

#!/bin/bash
#SBATCH --output=out_cluster.log
#SBATCH --error=err_cluster.log
#SBATCH --time=8:00:00
#SBATCH --nodes=36
#SBATCH --exclusive
#SBATCH --cpus-per-task=2

export IPYTHONDIR="`pwd`/.ipython"
export IPYTHON_PROFILE=ipyparallel.${SLURM_JOBID}

whereis ipcontroller

sleep 3
echo "===== Beginning ipcontroller execution ======"
ipcontroller --init --ip='*' --nodb --profile=${IPYTHON_PROFILE} --ping=30000 & # --sqlitedb
echo "===== Finish ipcontroller execution ======"
sleep 15
srun ipengine --profile=${IPYTHON_PROFILE} --timeout=300 &
sleep 75
echo "===== Beginning python execution ======"

python run_simulations.py

This may need adapting depending on your system; read more here: https://ipyparallel.readthedocs.io/en/latest/process.html

and run_simulations.py should contain something like this:

import os
import sys
import subprocess

from ipyparallel import Client
from tqdm import tqdm


def run_sim(x):
    # imports are repeated here because this function runs on the remote engines
    import os
    import subprocess

    # send job! (assumes the Matlab script writes its results to file)
    params = [str(i) for i in x]
    p1 = subprocess.Popen(['matlab', 'script.mat'] + params, env=dict(**os.environ))
    p1.wait()

    return


# load ipython parallel
rc = Client(profile=os.getenv('IPYTHON_PROFILE'))
print('Using ipyparallel with %d engines' % len(rc))
sys.stdout.flush()
lview = rc.load_balanced_view()

to_send = []
# prepare variables  <-- here you should prepare the arguments for matlab
for param_1 in [1, 2, 3, 4]:
    for param_2 in [10, 20, 40]:
        to_send.append([param_1, param_2])

# dispatch all simulations; the load-balanced view keeps every engine busy
ind_raw_features = lview.map_async(run_sim, to_send)
all_results = []

print('Sending jobs')
sys.stdout.flush()
for i in tqdm(ind_raw_features, file=sys.stdout):
    all_results.append(i)

You also get a progress bar in stdout, which is nice... You can also easily add a check that skips a run if its output files already exist.
