
When training a model over multiple GPUs on the same machine using PyTorch, how is the batch size divided?

Original post: 2023-01-23 14:45:29. Tags: pytorch, backpropagation, multi-gpu

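Even after looking through the PyTorch forums, I'm still not certain about this one. Let's say I'm using PyTorch DDP to train a model over 4 GPUs on the same machine.

Suppose I choose a batch size of 8. Is the model theoretically backpropagating over 2 examples at every step, so that the final result is for a model trained with a batch size of 2? Or does the model gather the gradients from each GPU at every step and then backpropagate with a batch size of 8?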

1 Answer

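The actual batch size is the size of the input you feed to each worker, which in your case is 8. In other words, each worker runs backpropagation once per 8 examples; the batch of 8 is not split into 2 examples per GPU.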
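
To make the arithmetic concrete, here is a minimal plain-Python sketch (no PyTorch required). It assumes that DDP averages the gradients across workers after each backward pass, which is how `DistributedDataParallel` is documented to behave. The toy one-parameter model, the random data, and the helper names `mean_grad` and `ddp_style_grad` are hypothetical and only serve the illustration. Under those assumptions, 4 workers that are each fed 8 examples produce the same averaged gradient as a single pass over all 32 examples.

```python
import math
import random


def mean_grad(w, xs, ys):
    """Gradient of the loss mean((w*x - y)^2) with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def ddp_style_grad(w, xs, ys, world_size, per_worker_batch):
    """Each worker computes a gradient on its own slice; the results are averaged.

    This imitates the gradient averaging step only. It is not PyTorch code.
    """
    grads = []
    for rank in range(world_size):
        start = rank * per_worker_batch
        stop = start + per_worker_batch
        grads.append(mean_grad(w, xs[start:stop], ys[start:stop]))
    return sum(grads) / world_size


WORLD_SIZE = 4        # number of GPUs / processes, as in the question
PER_WORKER_BATCH = 8  # batch size handed to each worker
GLOBAL_BATCH = WORLD_SIZE * PER_WORKER_BATCH

rng = random.Random(0)
xs = [rng.uniform(-1.0, 1.0) for _ in range(GLOBAL_BATCH)]
ys = [3.0 * x + rng.gauss(0.0, 0.1) for x in xs]
w = 0.5  # current value of the single model parameter

averaged = ddp_style_grad(w, xs, ys, WORLD_SIZE, PER_WORKER_BATCH)
single = mean_grad(w, xs, ys)

print("examples consumed per optimizer step:", GLOBAL_BATCH)
print("averaged per-worker gradient matches full-batch gradient:",
      math.isclose(averaged, single, rel_tol=1e-9))
```

The equality holds because every worker sees the same number of examples, so the mean of the per-worker means is the mean over all of them. If you want the optimizer to see 8 examples per step in total, a common choice is to give each of the 4 workers a batch size of 2.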

A concrete code example: https://gist.github.com/sgraaf/5b0caa3a320f28c27c12b5efeb35aa4c#file-ddp_example-py-L63 . This is the batch size.
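
As a rough illustration of what a per-process batch size means for the data, the sketch below mimics in plain Python how a dataset could be divided among 4 ranks. It assumes a strided split in which each rank takes every `world_size`-th index, which is what `torch.utils.data.distributed.DistributedSampler` does when shuffling is off and the dataset length divides evenly. `shard_indices` and `batches` are hypothetical helpers, not PyTorch functions, and the dataset length of 64 is made up.

```python
def shard_indices(dataset_len, world_size, rank):
    """Indices seen by one rank under a strided split (assumed, simplified)."""
    return list(range(rank, dataset_len, world_size))


def batches(indices, batch_size):
    """Cut one rank's indices into consecutive batches of batch_size."""
    return [indices[i:i + batch_size] for i in range(0, len(indices), batch_size)]


DATASET_LEN = 64  # hypothetical dataset size
WORLD_SIZE = 4    # number of GPUs / processes
BATCH_SIZE = 8    # batch size configured in every process

per_rank = {
    rank: batches(shard_indices(DATASET_LEN, WORLD_SIZE, rank), BATCH_SIZE)
    for rank in range(WORLD_SIZE)
}

for rank, rank_batches in per_rank.items():
    print(f"rank {rank}: {len(rank_batches)} steps, first batch {rank_batches[0]}")

# Every rank draws its own batch of 8 at each step.
samples_per_step = sum(len(rank_batches[0]) for rank_batches in per_rank.values())
print("samples consumed per step across all ranks:", samples_per_step)
```

Each rank works through a disjoint quarter of the data in batches of 8, so one synchronized step consumes 32 distinct samples in this setup.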

Related questions

Using multiple GPUs outside of training in PyTorch

How do Keras and PyTorch handle the last batch when batch_size is not a multiple of the size of the training data?

Training multiple PyTorch models on GPUs

How to train a model with multiple GPUs in PyTorch?

PyTorch DataLoader fails when the number of examples is not exactly divisible by the batch size

How to reduce model size in PyTorch post training

How to use multiple GPUs in PyTorch?

How do deep learning frameworks such as PyTorch handle memory when using multiple GPUs?

Using tensor.share_memory_() vs multiprocessing.Queue in PyTorch when training a model across multiple processes

How to leverage the world-size parameter for DistributedDataParallel in the PyTorch example for multiple GPUs?


