How to make deep reinforcement learning training faster

As you know, Deep Reinforcement Learning (DRL) training can take more than 10 days on a single CPU. Using parallel execution tools such as CUDA, the training time drops to as little as about 1 day (depending on the CPU and GPU). But even with CUDA, GPU usage sits around 10% and training is still too slow. This is frustrating for developers who want to check results frequently while developing their code. What do you recommend to reduce the training time as much as possible, in terms of coding tips, model building, settings, GPU hardware, etc.?

From the docs:

By default, TensorFlow maps nearly all of the GPU memory of all GPUs (subject to CUDA_VISIBLE_DEVICES) visible to the process

So you shouldn't have to change any setting to allow more GPU memory usage. The quickest thing to check is therefore whether the batch size is large enough - you might simply not be using the available GPU memory to its fullest extent. Try increasing the batch size until you get an OOM error, then scale it back a bit so training runs reliably.
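As a rough illustration (my own sketch, not code from the answer), one way to automate that search, assuming a Keras model and in-memory training data; build_model, x, and y are hypothetical placeholders for your own code:

import tensorflow as tf

# Hypothetical helper: double the batch size until fit() raises an OOM
# error, then return the last size that worked. In practice GPU memory
# can stay fragmented after an OOM, so a process restart may be needed.
def largest_working_batch_size(build_model, x, y, start=32, max_size=8192):
    best = None
    size = start
    while size <= max_size:
        tf.keras.backend.clear_session()  # free graph state between tries
        try:
            model = build_model()
            model.fit(x, y, batch_size=size, epochs=1, verbose=0)
            best = size   # this batch size fit in GPU memory
            size *= 2     # try a larger one
        except tf.errors.ResourceExhaustedError:
            break         # OOM reached: keep the previous size
    return best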

If you have access to multiple GPUs, you can use distributed strategies in tensorflow to make sure all GPUs are being used:

import tensorflow as tf

mirrored_strategy = tf.distribute.MirroredStrategy()
with mirrored_strategy.scope():
    # build and compile your model here
    ...

See the docs here

The mirrored strategy is used for synchronous distributed training across multiple GPUs on a single server. There's also a more intuitive explanation in this blog.
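As a fuller sketch of that pattern (the toy model and random data are my placeholders, not from the answer): variables created inside the strategy scope are mirrored on every visible GPU, and fit() then splits each global batch across the replicas.

import numpy as np
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()
print("Replicas in sync:", strategy.num_replicas_in_sync)

with strategy.scope():
    # Build and compile inside the scope so variables are mirrored.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu", input_shape=(8,)),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")

# fit() can run outside the scope; each global batch of 256 is split
# evenly across the replicas.
x_train = np.random.rand(1024, 8).astype("float32")
y_train = np.random.rand(1024, 1).astype("float32")
model.fit(x_train, y_train, batch_size=256, epochs=2)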

Finally, for more efficient processing you can change the datatype used for the model's parameters and computations by using mixed precision.
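A minimal sketch of the Keras mixed-precision API (the toy model is a placeholder): computations run in float16 on the GPU while variables stay in float32 for numerical stability.

import tensorflow as tf

# Enable mixed precision globally: layers compute in float16 but keep
# their variables in float32.
tf.keras.mixed_precision.set_global_policy("mixed_float16")

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(8,)),
    # Keep the final layer's output in float32 so the loss is stable.
    tf.keras.layers.Dense(1, dtype="float32"),
])
model.compile(optimizer="adam", loss="mse")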
