How to use multiple GPUs in PyTorch?
I use this command to use a single GPU:
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
But I want to use two GPUs in Jupyter, like this:
device = torch.device("cuda:0,1" if torch.cuda.is_available() else "cpu")
Using multiple GPUs is as simple as wrapping a model in DataParallel and increasing the batch size. Check these two tutorials for a quick start:
Assuming that you want to distribute the data across the available GPUs (if you have a batch size of 16 and 2 GPUs, you might be looking at providing 8 samples to each GPU), rather than spreading parts of the model across different GPUs, this can be done as follows:
If you want to use all the available GPUs:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = CreateModel()
model = nn.DataParallel(model)
model.to(device)
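For illustration, here is a hedged sketch of how a forward pass behaves once the model is wrapped (CreateModel and the input shape are placeholders; this assumes the model takes 10 input features): with 2 visible GPUs and a batch of 16, DataParallel splits the batch into two chunks of 8, runs them in parallel, and gathers the outputs on the default GPU.

X = torch.randn(16, 10).to(device)   # a toy batch of 16 samples (hypothetical shape)
out = model(X)                       # each GPU processes 8 samples; output is gathered on cuda:0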
If you want to use specific GPUs (for example, using 2 out of 4 GPUs):
device = torch.device("cuda:1,3" if torch.cuda.is_available() else "cpu") ## specify the GPU id's, GPU id's start from 0.
model = CreateModel()
model= nn.DataParallel(model,device_ids = [1, 3])
model.to(device)
To use specific GPUs by setting an OS environment variable:
Before executing the program, set the CUDA_VISIBLE_DEVICES variable as follows:
export CUDA_VISIBLE_DEVICES=1,3
(assuming you want to select the 2nd and 4th GPUs)
Then, within the program, you can just use DataParallel() as though you wanted to use all the GPUs (similar to the first case). Here, the GPUs available to the program are restricted by the OS environment variable.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = CreateModel()
model = nn.DataParallel(model)
model.to(device)
In all of these cases, the data has to be mapped to the device. If X and y are the data:
X = X.to(device)  # .to() on a tensor returns a copy, so reassign the result
y = y.to(device)
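For example, a hedged sketch of a training loop where each batch is mapped to the device (loader, criterion and optimizer are assumed to exist already and are placeholders):

for X, y in loader:
    X, y = X.to(device), y.to(device)   # move each batch onto the device
    optimizer.zero_grad()
    loss = criterion(model(X), y)
    loss.backward()
    optimizer.step()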
Another option would be to use some helper libraries for PyTorch:
These provide the concept of a context manager for distributed configuration:
This is possibly the best option, IMHO, to train on CPU/GPU/TPU without changing your original PyTorch code.
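As a rough, hedged sketch of what such a context-manager-based setup can look like, assuming a helper library such as PyTorch-Ignite and its ignite.distributed module (CreateModel and the training function body are placeholders):

import ignite.distributed as idist

def training(local_rank, config):
    device = idist.device()                   # right device for this process (GPU/CPU/TPU)
    model = idist.auto_model(CreateModel())   # wraps the model for the current distributed setup
    # ... usual PyTorch training loop, otherwise unchanged

with idist.Parallel(backend="nccl", nproc_per_node=2) as parallel:
    parallel.run(training, {})                # spawns and configures the worker processes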
It is worth checking Catalyst for similar distributed GPU options.
In 2022, PyTorch says:
It is recommended to use DistributedDataParallel, instead of this class, to do multi-GPU training, even if there is only a single node. See: Use nn.parallel.DistributedDataParallel instead of multiprocessing or nn.DataParallel and Distributed Data Parallel.
From https://pytorch.org/docs/stable/generated/torch.nn.DataParallel.html#torch.nn.DataParallel
Thus, it seems that we should use DistributedDataParallel, not DataParallel.
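As a rough, hedged sketch of what the recommended single-node DistributedDataParallel setup looks like (assuming a launch such as torchrun --nproc_per_node=2 train.py; CreateModel is the same placeholder constructor used above):

import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group(backend="nccl")     # one process per GPU
    local_rank = int(os.environ["LOCAL_RANK"])  # set by torchrun
    torch.cuda.set_device(local_rank)

    model = CreateModel().to(local_rank)
    model = DDP(model, device_ids=[local_rank])

    # ... usual training loop; give the DataLoader a DistributedSampler so
    # each process sees its own shard of the data.

    dist.destroy_process_group()

if __name__ == "__main__":
    main()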
When I ran naiveinception_googlenet, the above methods didn't work for me. The following method solved my problem.
import os
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "0,3"  # specify which GPU(s) to be used
If you want to run your code only on specific GPUs (e.g. only on GPUs 2 and 3), you can specify that using the CUDA_VISIBLE_DEVICES=2,3 variable when triggering the Python code from the terminal.
CUDA_VISIBLE_DEVICES=2,3 python lstm_demo_example.py --epochs=30 --lr=0.001
and inside the code, leave it as:
device = torch.device("cuda" if torch.cuda.is_available() else 'cpu')
model = LSTMModel()
model = nn.DataParallel(model)
model = model.to(device)
Source: https://glassboxmedicine.com/2020/03/04/multi-gpu-training-in-pytorch-data-and-model-parallelism/