Slurm job array submission severely underutilizing available resources

The SLURM job array submission isn't working as I expected. When I run my sbatch script to create the array and run the programs, I expect it to fully utilize all the cores that are available; however, it only allows one job from the array to run on a given node at a time. SCONTROL shows the job using all 36 cores on the node when I specified 4 cores for the process. Additionally, I want to restrict the jobs to running on one specific node, but if other nodes are unused, it will submit a job onto them as well, using every core available on that node.

I've tried submitting the jobs with different values for --nodes, --ntasks, --nodelist, --ntasks-per-node, and --cpus-per-task, setting OMP_NUM_THREADS, and specifying the number of cores for mpirun directly. None of these options seemed to change anything at all.

#!/bin/bash
#SBATCH --time=2:00:00   # walltime
#SBATCH --ntasks=1   # number of processor cores (i.e. tasks)
#SBATCH --nodes=1    # number of nodes
#SBATCH --nodelist node001
#SBATCH --ntasks-per-node=9
#SBATCH --cpus-per-task=4
#SBATCH --mem-per-cpu=500MB   # memory per CPU core

#SBATCH --array=0-23%8

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

mpirun -n 4 MYPROGRAM

I expected to be able to run eight instances of MYPROGRAM, each utilizing four cores for a parallel operation. In total, I expected to use 32 cores at a time for MYPROGRAM, plus however many cores are needed to run the job submission program.

Instead, my squeue output looks like this:

JOBID          PARTITION    NAME      USER   ST   TIME  NODES CPUS
  num_[1-23%6]  any      MYPROGRAM   user   PD   0:00      1 4
  num_0         any      MYPROGRAM   user    R   0:14      1 36

It says that I am using all available cores on the node for this process, and it will not allow additional array jobs to begin. While MYPROGRAM runs exactly as expected, only one instance of it is running at any given time.

And my SCONTROL output looks like this:

   UserId=user(225589) GroupId=domain users(200513) MCS_label=N/A
   Priority=4294900562 Nice=0 Account=(null) QOS=normal
   JobState=PENDING Reason=Resources Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
   RunTime=00:00:00 TimeLimit=02:00:00 TimeMin=N/A
   SubmitTime=2019-06-21T18:46:25 EligibleTime=2019-06-21T18:46:26
   StartTime=Unknown EndTime=Unknown Deadline=N/A
   PreemptTime=None SuspendTime=None SecsPreSuspend=0
   LastSchedEval=2019-06-21T18:46:28
   Partition=any AllocNode:Sid=w***:45277
   ReqNodeList=node001 ExcNodeList=(null)
   NodeList=(null) SchedNodeList=node001
   NumNodes=1-1 NumCPUs=4 NumTasks=1 CPUs/Task=4 ReqB:S:C:T=0:0:*:*
   TRES=cpu=4,mem=2000M,node=1
   Socks/Node=* NtasksPerN:B:S:C=9:0:*:* CoreSpec=*
   MinCPUsNode=36 MinMemoryCPU=500M MinTmpDiskNode=0
   Features=(null) DelayBoot=00:00:00
   Gres=(null) Reservation=(null)
   OverSubscribe=NO Contiguous=0 Licenses=(null) Network=(null)

   Power=

JobId=1694 ArrayJobId=1693 ArrayTaskId=0 JobName=launch_vasp.sh
   UserId=user(225589) GroupId=domain users(200513) MCS_label=N/A
   Priority=4294900562 Nice=0 Account=(null) QOS=normal
   JobState=RUNNING Reason=None Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
   RunTime=00:00:10 TimeLimit=02:00:00 TimeMin=N/A
   SubmitTime=2019-06-21T18:46:25 EligibleTime=2019-06-21T18:46:26
   StartTime=2019-06-21T18:46:26 EndTime=2019-06-21T20:46:26 Deadline=N/A
   PreemptTime=None SuspendTime=None SecsPreSuspend=0
   LastSchedEval=2019-06-21T18:46:26
   Partition=any AllocNode:Sid=w***:45277
   ReqNodeList=node001 ExcNodeList=(null)
   NodeList=node001
   BatchHost=node001
   NumNodes=1 NumCPUs=36 NumTasks=1 CPUs/Task=4 ReqB:S:C:T=0:0:*:*
   TRES=cpu=36,mem=18000M,node=1,billing=36
   Socks/Node=* NtasksPerN:B:S:C=9:0:*:* CoreSpec=*
   MinCPUsNode=36 MinMemoryCPU=500M MinTmpDiskNode=0
   Features=(null) DelayBoot=00:00:00
   Gres=(null) Reservation=(null)
   OverSubscribe=NO Contiguous=0 Licenses=(null) Network=(null)

   Power=

Something is going wrong in how SLURM is assigning cores to tasks, but nothing I've tried changes anything. I'd appreciate any help you can give.

Check if the slurm.conf file allows consumable resources. The default is to allocate nodes exclusively. I had to add the following lines to allow per-core scheduling:

SelectType=select/cons_res
SelectTypeParameters=CR_Core
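
To check which select plugin the cluster is currently running, scontrol show config can be queried (a quick sanity check, assuming a standard Slurm installation). Note that changing SelectType generally requires restarting the Slurm daemons; scontrol reconfigure alone is not enough.

scontrol show config | grep -i '^Select'

With consumable resources enabled, each array task only needs to request the four cores it actually uses, and Slurm can then pack eight of them onto node001 at once. Below is a minimal sketch of a corrected submission script, assuming the same node, program, and limits as in the question. --ntasks-per-node=9 is dropped: combined with --cpus-per-task=4 it asks for 9 × 4 = 36 CPUs on the node for every array task, which matches the MinCPUsNode=36 in the scontrol output above.

#!/bin/bash
#SBATCH --time=2:00:00        # walltime per array task
#SBATCH --nodes=1             # each array task stays on one node
#SBATCH --nodelist=node001    # restrict all tasks to node001
#SBATCH --ntasks=1            # one mpirun launch per array task
#SBATCH --cpus-per-task=4     # four cores per array task
#SBATCH --mem-per-cpu=500MB   # memory per CPU core
#SBATCH --array=0-23%8        # 24 tasks, at most 8 running at once

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

mpirun -n 4 MYPROGRAM

With this layout, squeue should show several array tasks in state R on node001 at the same time, each with CPUS=4.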
