
PyTorch: What is the difference between tensor.cuda() and tensor.to(torch.device("cuda:0"))?

In PyTorch, what is the difference between the following two methods in sending a tensor (or model) to GPU:

Setup:

X = np.array([[1, 3, 2, 3], [2, 3, 5, 6], [1, 2, 3, 4]]) # X = model()
X = torch.DoubleTensor(X)
Method 1:

X.cuda()

Method 2:

device = torch.device("cuda:0")
X = X.to(device)

(I don't really need a detailed explanation of what is happening in the backend, just want to know if they are both essentially doing the same thing)

There is no difference between the two.
Early versions of PyTorch had .cuda() and .cpu() methods to move tensors and models from CPU to GPU and back. However, this made writing device-agnostic code cumbersome:

if cuda_available:
  x = x.cuda()
  model.cuda()
else:
  x = x.cpu()
  model.cpu()

Later versions introduced .to(), which handles both directions (and any device) in a single, uniform call:

device = torch.device('cuda') if cuda_available else torch.device('cpu')
x = x.to(device)
model = model.to(device)

Their syntax varies slightly, but they are equivalent:

               .to(name)      .to(device)                   .cuda()
CPU            to('cpu')      to(torch.device('cpu'))       cpu()
Current GPU    to('cuda')     to(torch.device('cuda'))      cuda()
Specific GPU   to('cuda:1')   to(torch.device('cuda:1'))    cuda(device=1)
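A minimal check that the two spellings land a tensor on the same device. The snippet falls back to CPU when no GPU is present, so it runs anywhere (the variable names are illustrative):

```python
import torch

# Pick CUDA if available, otherwise CPU, so the snippet is portable.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

x = torch.arange(4.0)
y = x.to(device)  # .to() works uniformly for CPU and GPU

# Old-style API: .cuda() / .cpu() must be chosen by hand.
z = x.cuda() if torch.cuda.is_available() else x.cpu()

# Both paths resolve to the same device.
assert y.device == z.device
```

Note that neither call moves the tensor in place: both return a new tensor on the target device, which is why the result must be assigned back (X = X.to(device)).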

Note: the current CUDA device is 0 by default, but this can be changed with torch.cuda.set_device().
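As a sketch of how set_device() interacts with a bare .cuda() call (guarded so it is a no-op on machines without at least two GPUs):

```python
import torch

# torch.cuda.set_device() changes which GPU an index-less .cuda() targets.
if torch.cuda.device_count() > 1:
    torch.cuda.set_device(1)            # GPU 1 becomes the "current" device
    t = torch.zeros(2).cuda()           # no index given, so t lands on cuda:1
    assert t.device == torch.device("cuda:1")
    # Equivalent explicit form:
    assert torch.zeros(2).to("cuda:1").device == t.device
```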
