
PyTorch: What is the difference between tensor.cuda() and tensor.to(torch.device("cuda:0"))?

In PyTorch, what is the difference between the following two methods in sending a tensor (or model) to GPU:

Setup:

X = np.array([[1, 3, 2, 3], [2, 3, 5, 6], [1, 2, 3, 4]]) # X = model()
X = torch.DoubleTensor(X)
Method 1:

X.cuda()

Method 2:

device = torch.device("cuda:0")
X = X.to(device)

(I don't really need a detailed explanation of what is happening in the backend, just want to know if they are both essentially doing the same thing)

There is no difference between the two.
Early versions of PyTorch had .cuda() and .cpu() methods to move tensors and models from CPU to GPU and back. However, this made writing device-agnostic code cumbersome:

if cuda_available:
  x = x.cuda()
  model.cuda()
else:
  x = x.cpu()
  model.cpu()

Later versions introduced .to(), which handles both directions (and any device) in a single, uniform call:

device = torch.device('cuda') if cuda_available else torch.device('cpu')
x = x.to(device)
model = model.to(device)

Their syntax varies slightly, but they are equivalent:

               .to(name)      .to(device)                   .cuda()
CPU            to('cpu')      to(torch.device('cpu'))       cpu()
Current GPU    to('cuda')     to(torch.device('cuda'))      cuda()
Specific GPU   to('cuda:1')   to(torch.device('cuda:1'))    cuda(device=1)
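A minimal check that the two spellings land a tensor on the same device. The snippet falls back to CPU when no GPU is present, so it runs anywhere (the variable names are illustrative):

```python
import torch

# Pick CUDA if available, otherwise CPU, so the snippet is portable.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

x = torch.arange(4.0)
y = x.to(device)  # .to() works uniformly for CPU and GPU

# Old-style API: .cuda() / .cpu() must be chosen by hand.
z = x.cuda() if torch.cuda.is_available() else x.cpu()

# Both paths resolve to the same device.
assert y.device == z.device
```

Note that neither call moves the tensor in place: both return a new tensor on the target device, which is why the result must be assigned back (X = X.to(device)).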

Note: the current CUDA device is 0 by default, but this can be changed with torch.cuda.set_device().
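As a sketch of how set_device() interacts with a bare .cuda() call (guarded so it is a no-op on machines without at least two GPUs):

```python
import torch

# torch.cuda.set_device() changes which GPU an index-less .cuda() targets.
if torch.cuda.device_count() > 1:
    torch.cuda.set_device(1)            # GPU 1 becomes the "current" device
    t = torch.zeros(2).cuda()           # no index given, so t lands on cuda:1
    assert t.device == torch.device("cuda:1")
    # Equivalent explicit form:
    assert torch.zeros(2).to("cuda:1").device == t.device
```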
