How do deep learning frameworks such as PyTorch handle memory when using multiple GPUs?

I have recently run into a situation where I am running out of memory on a single Nvidia V100. I have limited experience using multiple GPUs to train networks, so I'm a little unsure of how the data parallelization process works. Let's say I'm using a model and batch size that requires something like 20-25GB of memory. Is there any way to take advantage of the full 32GB of memory I have between two 16GB V100s? Would PyTorch's DataParallel functionality achieve this? I suppose there is also the possibility of breaking the model up and using model parallelism as well. Please excuse my lack of knowledge on this subject. Thanks in advance for any help or clarification!

You should keep model parallelism as your last resort, and use it only if your model doesn't fit in the memory of a single GPU (with 16GB per GPU you have plenty of room for a gigantic model).
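For completeness, here is a minimal sketch of what model parallelism looks like in PyTorch: you place different parts of the model on different devices and move the intermediate activations between them by hand. The class name, layer sizes, and batch size below are made up purely for illustration.

import torch
import torch.nn as nn

class TwoGPUModel(nn.Module):
    def __init__(self):
        super().__init__()
        # first half of the network lives on GPU 0
        self.part1 = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU()).to('cuda:0')
        # second half lives on GPU 1
        self.part2 = nn.Linear(4096, 10).to('cuda:1')

    def forward(self, x):
        x = self.part1(x.to('cuda:0'))
        # move the intermediate activations over to the second GPU
        x = self.part2(x.to('cuda:1'))
        return x

model = TwoGPUModel()
output = model(torch.randn(32, 1024))  # output ends up on cuda:1

The price you pay is the device-to-device transfer of activations on every forward pass, which is why it only makes sense when the parameters themselves don't fit on one card.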

If you have two GPUs, I would use data parallelism. In data parallelism you have a copy of your model on each GPU, and each copy is fed a slice of the input batch. The gradients are then gathered and used to update the copies.

Pytorch makes it really easy to achieve data parallelism, as you just need to wrap your model instance in nn.DataParallel:

# replicate the model onto GPUs 0 and 1; input_var is split along the batch dimension
model = torch.nn.DataParallel(model, device_ids=[0, 1])
output = model(input_var)
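A minimal end-to-end sketch of how this would look in a training loop (the model, optimizer, loss, and batch sizes here are placeholders, not from the original question):

import torch
import torch.nn as nn

# placeholder model and data; substitute your own
model = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU(), nn.Linear(4096, 10))
model = nn.DataParallel(model, device_ids=[0, 1]).to('cuda:0')

optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()

inputs = torch.randn(64, 1024).to('cuda:0')        # full batch on the primary GPU
targets = torch.randint(0, 10, (64,)).to('cuda:0')

optimizer.zero_grad()
outputs = model(inputs)          # batch is split: roughly 32 samples per GPU
loss = criterion(outputs, targets)
loss.backward()                  # gradients are reduced back onto GPU 0
optimizer.step()

Note that each GPU only holds the activations for its own slice of the batch, which is what lets a batch that would not fit on a single 16GB card fit across two of them; the model parameters themselves are still replicated on both GPUs.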
