How to define multiple gres resources in SLURM using the same GPU device?

Original post: 2021-12-02 23:10:36 | 6 2 | tensorflow, gpu, slurm

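I am running machine learning (ML) jobs that use very little GPU memory, so I could run several ML jobs on a single GPU.

To achieve that, I would like to add multiple lines to the gres.conf file that specify the same device. However, the slurm daemon does not seem to accept this, and the service returns: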

fatal: Gres GPU plugin failed to load configuration

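Is there an option I am missing to make this work?

Or is there a different way to achieve this with SLURM?

This is somewhat similar to the question linked below, but that one seems specific to CUDA code compiled with a particular option enabled, which looks much more specific than my general case (at least as far as I understand it): How to run multiple jobs on a GPU grid with CUDA using SLURM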

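2 answers

I don't think you can oversubscribe GPUs, so I see two options:

You can configure the CUDA Multi-Process Service, or
pack multiple computations into a single job that has one GPU and run them in parallel.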
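The second option can be sketched in Python roughly as follows. This is only a minimal illustration under some assumptions: the job has already been allocated one GPU by SLURM, the helper name `run_packed` and the placeholder commands are hypothetical, and the tasks are assumed to be TensorFlow programs (going by the question's tags), which is why `TF_FORCE_GPU_ALLOW_GROWTH` is set so that each process does not try to reserve the whole GPU memory up front.

```python
import os
import subprocess
import sys


def run_packed(commands, extra_env=None):
    """Start all commands at once, wait for them, and return their exit codes in order.

    Every child inherits the job's environment, including whatever
    CUDA_VISIBLE_DEVICES the scheduler set, so all of them share the same GPU.
    """
    env = dict(os.environ)
    env.update(extra_env or {})
    # Launch everything first so the tasks really run concurrently.
    procs = [subprocess.Popen(cmd, env=env) for cmd in commands]
    # Then collect the exit codes in launch order.
    return [p.wait() for p in procs]


if __name__ == "__main__":
    # Placeholder workloads (assumption): replace with the real training commands.
    commands = [
        [sys.executable, "-c", f"print('task {i} finished')"] for i in range(3)
    ]
    # Assumption: TensorFlow tasks; ask TF to grow GPU memory on demand.
    exit_codes = run_packed(commands, {"TF_FORCE_GPU_ALLOW_GROWTH": "true"})
    print(exit_codes)
```

Such a script would be the payload of a single batch job that requests one GPU. From the scheduler's point of view the GPU is still used by one job, so no change to gres.conf is needed.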

Besides the NVIDIA MPS mentioned by @Marcus Boden, which is relevant to V100-type cards, there is also Multi-Instance GPU (MIG), which is relevant to A100-type cards.
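As a rough illustration of how a process could be pinned to one MIG instance, the sketch below parses the output of `nvidia-smi -L` for MIG UUIDs and builds an environment with `CUDA_VISIBLE_DEVICES` set to one of them. It assumes MIG has already been enabled and partitioned by an administrator. The helper names and the sample listing (including its UUIDs) are made up for illustration, and the exact `nvidia-smi -L` layout may differ between driver versions.

```python
import re
import subprocess

# Matches "(UUID: MIG-...)" entries; plain "(UUID: GPU-...)" lines are ignored.
MIG_UUID = re.compile(r"\(UUID:\s*(MIG-[^)\s]+)\)")


def parse_mig_uuids(listing):
    """Return the MIG instance UUIDs found in an `nvidia-smi -L` style listing."""
    return MIG_UUID.findall(listing)


def list_mig_uuids():
    """Query the local driver; requires nvidia-smi on PATH (not called at import)."""
    result = subprocess.run(
        ["nvidia-smi", "-L"], capture_output=True, text=True, check=True
    )
    return parse_mig_uuids(result.stdout)


def env_for_instance(uuid, base_env):
    """Copy base_env and restrict CUDA to the single MIG instance `uuid`."""
    env = dict(base_env)
    env["CUDA_VISIBLE_DEVICES"] = uuid
    return env


# Illustrative listing only (assumption): real UUIDs are much longer.
SAMPLE = """GPU 0: A100-SXM4-40GB (UUID: GPU-aaaa)
  MIG 1g.5gb Device 0: (UUID: MIG-1111)
  MIG 1g.5gb Device 1: (UUID: MIG-2222)
"""

if __name__ == "__main__":
    for uuid in parse_mig_uuids(SAMPLE):
        print(env_for_instance(uuid, {})["CUDA_VISIBLE_DEVICES"])
```

On a cluster the scheduler would normally be the one handing out instances. The manual assignment here is only meant to show what isolating jobs per MIG instance looks like from the process side.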

