I have a list of CUDA tensors:
>>> X_train
[tensor([ 101, 3533...='cuda:0'), tensor([ 101, 3422...='cuda:0'), tensor([ 101, 2054...='cuda:0'), tensor([ 101, 1019, ...='cuda:0'), tensor([ 101, 14674...='cuda:0'), tensor([ 101, 9246...='cuda:0'), tensor([ 101, 2054...='cuda:0'), tensor([ 101, 2339...='cuda:0'), ... ]
I am trying to apply k-fold cross-validation, so I want to index this list using the array of indices for each fold:
>>> X_train[train_index]
But this raises an error:
TypeError: only integer scalar arrays can be converted to a scalar index
As per this answer, the issue is that a plain Python list cannot be indexed with a list (or array) of indices, whereas a NumPy array can.
So I tried to convert the list to a NumPy array:
np.array(X_train)
But that raised another error:
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
Do I really have to move each of these CUDA tensors to the CPU before indexing? How can I easily index this list of CUDA tensors while keeping them on the GPU (so that I can use the GPU for training the model)? Or is that not possible, so that I should index them first (by converting them to a NumPy array) and then move the selected ones to CUDA? Is there a standard or preferred practice for handling data like this?
Converting to NumPy just for indexing is a bad option: it incurs copy operations between the GPU and the CPU, and in general it can disrupt PyTorch's computational graph.
What you can do instead is use a list comprehension:
subset = [X_train[i_] for i_ in train_index]
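To make the idea concrete, here is a minimal sketch of how the list comprehension fits into a k-fold loop. The helper names `take` and `kfold_indices` are hypothetical and only for illustration; `kfold_indices` stands in for whatever splitter produces `train_index`, and plain placeholder objects stand in for the CUDA tensors so the snippet runs without a GPU. The point it tries to show is that the selection only picks references out of the list, so the elements themselves are never copied or moved.

```python
from typing import Iterator, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def take(items: Sequence[T], indices) -> List[T]:
    """Return the elements of `items` at `indices`, in order (references only, no copies)."""
    # int(i) also accepts NumPy integer scalars, in case the indices come as a NumPy array.
    return [items[int(i)] for i in indices]


def kfold_indices(n: int, k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Hypothetical stand-in for a k-fold splitter: yields (train, val) index lists."""
    # Spread the remainder over the first n % k folds.
    fold_sizes = [n // k + (1 if f < n % k else 0) for f in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, val
        start += size


# Placeholder objects standing in for the CUDA tensors in X_train (an assumption
# made only so that the sketch runs anywhere).
X_train = [object() for _ in range(5)]

folds = list(kfold_indices(len(X_train), k=2))
train_index, val_index = folds[0]
X_fold_train = take(X_train, train_index)
X_fold_val = take(X_train, val_index)
```

Because `take` returns the very same objects that are stored in `X_train`, a list of tensors on `cuda:0` should stay on `cuda:0` after the selection, with no transfer to the CPU.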