
How to set slurm/salloc for 1 gpu per task but let job use multiple gpus?

We are looking for some advice with slurm salloc gpu allocations. Currently, given:

% salloc -n 4 -c 2 --gres=gpu:1
% srun env | grep CUDA   
CUDA_VISIBLE_DEVICES=0
CUDA_VISIBLE_DEVICES=0
CUDA_VISIBLE_DEVICES=0
CUDA_VISIBLE_DEVICES=0

However, we desire more than just device 0 to be used.
Is there a way to specify an salloc with srun/mpirun to get the following?

CUDA_VISIBLE_DEVICES=0
CUDA_VISIBLE_DEVICES=1
CUDA_VISIBLE_DEVICES=2
CUDA_VISIBLE_DEVICES=3

This is desired such that each task gets 1 gpu, but overall gpu usage is spread out among the 4 available devices (see gres.conf below), not a layout where every task gets device 0.

That way each task is not waiting on device 0 to free up from other tasks, as is currently the case.

Or is this the expected behavior even if we have more than 1 gpu available/free (4 total) for the 4 tasks? What are we missing or misunderstanding?

  • salloc / srun parameter?
  • slurm.conf or gres.conf setting?

Summary: We want to be able to use slurm and mpi such that each rank/task uses 1 gpu, but the job can spread its tasks/ranks among the 4 gpus. Currently it appears we are limited to device 0 only. We also want to avoid multiple srun submissions within an salloc/sbatch because of the mpi usage.

OS: CentOS 7

Slurm version: 16.05.6

Are we forced to use wrapper-based methods for this?

Are there differences between slurm versions (14 to 16) in how gpus are allocated?

Thank you!

Reference: gres.conf

Name=gpu File=/dev/nvidia0
Name=gpu File=/dev/nvidia1
Name=gpu File=/dev/nvidia2
Name=gpu File=/dev/nvidia3

First of all, try requesting four GPUs with

% salloc -n 4 -c 2 --gres=gpu:4

With --gres=gpu:1, it is the expected behaviour that all tasks see only one GPU. With --gres=gpu:4, the output would be

CUDA_VISIBLE_DEVICES=0,1,2,3
CUDA_VISIBLE_DEVICES=0,1,2,3
CUDA_VISIBLE_DEVICES=0,1,2,3
CUDA_VISIBLE_DEVICES=0,1,2,3

To get what you want, you can use a wrapper script, or modify your srun command like this:

srun bash -c 'CUDA_VISIBLE_DEVICES=$SLURM_PROCID env' | grep CUDA

then you will get

CUDA_VISIBLE_DEVICES=0
CUDA_VISIBLE_DEVICES=1
CUDA_VISIBLE_DEVICES=2
CUDA_VISIBLE_DEVICES=3
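The same idea can be packaged as a small wrapper instead of an inline bash -c. This is a sketch (the `gpu_wrap` name is hypothetical), and it makes the same assumption as the command above: the job step was allocated all GPUs on the node (e.g. --gres=gpu:4), so device indices 0..ntasks-1 are all visible.

```shell
# Hypothetical wrapper, shown as a shell function: pin each task to the
# GPU whose index equals its global task rank, then run the real program.
# Assumes the job can see all GPUs on the node (e.g. --gres=gpu:4).
gpu_wrap() {
    export CUDA_VISIBLE_DEVICES=$SLURM_PROCID
    "$@"
}
```

Sourced from a script file, it would be used as something like `srun bash -c '. ./gpu_wrap.sh; gpu_wrap ./my_mpi_app'`, so each MPI rank starts with exactly one visible device.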

This feature is planned for 19.05. See https://bugs.schedmd.com/show_bug.cgi?id=4979 for details.

Be warned that the suggested 'srun bash...' solution will break if your job does not request all GPUs on the node, because another process may be in control of GPU 0.
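One way to harden the wrapper against that case is to index into whatever device list Slurm actually granted the job, instead of assuming devices start at 0. The following is a hypothetical sketch (`pick_gpu` is not a Slurm feature): it relies only on the job-level CUDA_VISIBLE_DEVICES that Slurm exports and the node-local task rank SLURM_LOCALID.

```shell
# Hypothetical hardened variant: select the SLURM_LOCALID-th entry of the
# comma-separated device list the job was actually granted, wrapping
# around if there are more tasks per node than GPUs.
pick_gpu() {
    # Count devices in the job's list (e.g. "2,3" -> 2).
    n=$(echo "$CUDA_VISIBLE_DEVICES" | tr ',' '\n' | wc -l)
    # 1-based field index for this task's node-local rank, with wraparound.
    i=$(( (SLURM_LOCALID % n) + 1 ))
    # Narrow the task's view down to that single device.
    CUDA_VISIBLE_DEVICES=$(echo "$CUDA_VISIBLE_DEVICES" | cut -d',' -f"$i")
    export CUDA_VISIBLE_DEVICES
}
```

With this, a job that was granted only devices 2 and 3 would hand task 0 device 2 and task 1 device 3, rather than blindly claiming GPU 0.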

To get one GPU per task you need to use the --gpu-bind switch of the srun command. For example, if I have three nodes with 8 GPUs each and I wish to run eight tasks per node, each bound to a unique GPU, the following command would do the trick:

srun -p gfx908_120 -n 24 -G gfx908_120:24 --gpu-bind=single:1 -l bash -c 'echo $(hostname):$ROCR_VISIBLE_DEVICES'
