
Is it required to clear GPU tensors in PyTorch?

I am new to PyTorch, and I am exploring the functionality of the .to() method. As per the documentation for CUDA tensors, I see that it is possible to transfer tensors between CPU and GPU memory.

import torch

# let us run this cell only if CUDA is available
if torch.cuda.is_available():

    # creates a LongTensor and transfers it to GPU as torch.cuda.LongTensor
    a = torch.full((10,), 3, device=torch.device("cuda"))
    # transfers it to CPU, back to being a torch.LongTensor
    b = a.to(torch.device("cpu"))

In this context, I would like to know if it is always necessary to transfer the tensors back from the GPU to the CPU, perhaps to free the GPU memory? Doesn't the runtime automatically clear the GPU memory?

Apart from its use for transferring data between CPU and GPU, I would like to know the recommended usage of the .to() method (from a memory perspective). Thanks in advance.

In this context, I would like to know if it is always necessary to transfer the tensors back from the GPU to the CPU, perhaps to free the GPU memory?

No, it's not always necessary. Memory should be freed when there are no more references to a GPU tensor. The tensor should be cleared automatically in this case:

def foo():
    my_tensor = torch.tensor([1.2]).cuda()
    return "whatever"

smth = foo()

but it won't be in this case:

def bar():
    return torch.tensor([1.2]).cuda()

tensor = bar()
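One way to observe the difference is torch.cuda.memory_allocated(), which reports the memory currently held by live tensors (a minimal sketch repeating foo and bar from above; note that freed memory stays in PyTorch's caching allocator, so nvidia-smi will still show it as used):

import torch

def foo():
    my_tensor = torch.tensor([1.2]).cuda()
    return "whatever"

def bar():
    return torch.tensor([1.2]).cuda()

if torch.cuda.is_available():
    base = torch.cuda.memory_allocated()

    smth = foo()
    # my_tensor went out of scope, so its memory was released
    print(torch.cuda.memory_allocated() - base)  # 0

    tensor = bar()
    # the returned tensor is still referenced, so its memory is held
    print(torch.cuda.memory_allocated() - base)  # > 0

    del tensor
    # dropping the last reference releases the memory again
    print(torch.cuda.memory_allocated() - base)  # 0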

In the second case (the tensor is passed around, possibly accumulated or appended to a list), you should move it to the CPU so as not to waste GPU memory.
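For example, when collecting per-batch losses during training, a common pattern is to detach the value and move it to the CPU before storing it (a minimal sketch with a dummy model and dummy data, not from the original post):

import torch
from torch import nn

device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Linear(10, 1).to(device)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loader = [(torch.randn(4, 10), torch.randn(4, 1)) for _ in range(3)]  # dummy batches

losses = []
for x, y in loader:
    x, y = x.to(device), y.to(device)
    loss = criterion(model(x), y)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # detach() drops the autograd graph and cpu() moves the scalar off the
    # GPU; appending `loss` itself would keep its whole graph's memory alive
    losses.append(loss.detach().cpu())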

Apart from its use for transferring data between CPU and GPU, I would like to know the recommended usage of the .to() method (from a memory perspective)

Not sure what you mean here. What you should aim for is the fewest .to() calls, as they require copying the array (O(n) complexity), but they shouldn't be too costly anyway (compared to, say, pushing data through a neural network), so it's probably not worth being too hardcore about this micro-optimization.

Usually data loading is done on the CPU (transformations, augmentations), and each batch is copied to the GPU (possibly with pinned memory) just before it is passed to the neural network.
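In code, that usually means pin_memory=True on the DataLoader combined with non_blocking=True in the .to call, which lets the host-to-GPU copy overlap with computation (a sketch; train_dataset is a placeholder for your own Dataset):

from torch.utils.data import DataLoader

loader = DataLoader(train_dataset, batch_size=32, shuffle=True, pin_memory=True)  # train_dataset: your Dataset

for x, y in loader:
    # batches come out of the loader in pinned (page-locked) host memory,
    # so the copy to the GPU can be asynchronous
    x = x.to("cuda", non_blocking=True)
    y = y.to("cuda", non_blocking=True)
    # ... forward/backward pass ...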

Also, as of the 1.5.0 release, PyTorch provides a memory_format argument in the .to method. This lets you specify whether (N, C, H, W) (PyTorch's default) or channels-last (N, H, W, C) layout should be used for the tensor and the model (convolutional models with torch.nn.Conv2d, to be precise). This can further speed up your models (IIRC a speedup of 16% was reported for torchvision models); see here for more info and usage.
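Usage looks roughly like this (a sketch; channels-last mainly pays off for convolutional networks, especially with mixed precision on recent GPUs):

import torch
import torchvision

model = torchvision.models.resnet50().cuda()
model = model.to(memory_format=torch.channels_last)  # weights reordered to NHWC layout

x = torch.randn(8, 3, 224, 224, device="cuda")
x = x.to(memory_format=torch.channels_last)  # input must use the same layout

out = model(x)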
