What is the difference between detach, clone and deepcopy in Pytorch tensors in detail?
I've been struggling to understand the differences between .clone(), .detach() and copy.deepcopy when using Pytorch, in particular with Pytorch tensors.
I tried writing out all my questions about their differences and use cases, became overwhelmed quickly, and realized that keeping track of the 4 main properties of Pytorch tensors would clarify which one to use much better than going through every small question. The 4 main properties I realized one needs to keep track of are:

- whether one has a new python reference/pointer to the tensor object
- whether the tensor has its own new memory for the data
- whether the history/graph is kept (i.e. .grad_fn)
- the remaining tensor metadata (require_grads, shape, is_leaf, etc.)

According to what I mined out of the Pytorch forums and the documentation, these are my current distinctions for each when used on tensors:
For clone:对于克隆:
x_cloned = x.clone()
I believe this is how it behaves according to the main 4 properties:

- x_cloned has its own python reference/pointer to the new object
- it has created its own new memory for the tensor, i.e. a new tensor object with the same data as x
- it keeps the original tensor's computation history, and additionally records the clone operation as .grad_fn=<CloneBackward>
It seems that the main use of this, as I understand it, is to create copies of things so that inplace_ operations are safe. In addition, coupled with .detach as .detach().clone() (the "better" order to do it, btw), it creates a completely new tensor that has been detached from the old history and thus stops gradient flow through that path.
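Here is a minimal sketch of what I mean (variable names are my own, and the exact grad_fn repr may differ by version):

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
x_cloned = x.clone()

print(x_cloned is x)                        # False: a new python object/reference
print(x_cloned.data_ptr() == x.data_ptr())  # False: its own new memory for the data
print(x_cloned.grad_fn)                     # <CloneBackward0 ...>: history is kept

# gradients still flow back through the clone to x
x_cloned.sum().backward()
print(x.grad)                               # tensor([1., 1., 1.])
```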
For detach:

x_detached = x.detach()
I believe this is how it behaves:

- it creates a new python reference/pointer to a new tensor object (the only way not to get a new reference is x_new = x, of course). One can check this with id(), I believe
- the new tensor object x_detached shares the same data/memory as x
- it is cut off from the old computation history, so gradients do not flow through it

I believe the only sensible use I know of is creating new copies with their own memory when coupled with .clone(), as .detach().clone(). Otherwise, I am not sure what the use is. Since it points to the original data, doing in-place ops might be potentially dangerous (it changes the old data, but the change to the old data is NOT known by autograd in the earlier computation graph).
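Again, a small sketch of the behaviour I am describing (names are my own):

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
x_detached = x.detach()

print(id(x_detached) == id(x))                # False: a new python object...
print(x_detached.data_ptr() == x.data_ptr())  # True: ...but the same underlying storage
print(x_detached.requires_grad)               # False: cut off from the graph

# in-place ops on the detached tensor silently change x as well
x_detached.add_(10.0)
print(x)  # tensor([11., 12., 13.], requires_grad=True)
```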
For deepcopy:

x_deepcopy = copy.deepcopy(x)
I don't really see a use case for this. I assume anyone trying to use this really meant either 1) .detach().clone() or just 2) .clone() by itself, depending on whether one wants to stop gradient flow to the earlier graph with 1), or just wants to replicate the data with new memory with 2).
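For what it's worth, a small sketch of what I observe when I try it (my own example; the error in the comment is what I get for non-leaf tensors):

```python
import copy
import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)
x_deepcopy = copy.deepcopy(x)

print(x_deepcopy.data_ptr() == x.data_ptr())  # False: its own memory
print(x_deepcopy.requires_grad)               # True: metadata is copied too
print(x_deepcopy.grad_fn)                     # None: a new leaf, no shared history

y = x * 2  # a non-leaf tensor
# copy.deepcopy(y)  # RuntimeError: only Tensors created explicitly by the user
#                   # (graph leaves) support the deepcopy protocol at the moment
```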
So this is the best way I have to understand the differences as of now, rather than asking about all the different scenarios in which one might use them.

So is this right? Does anyone see any major flaw that needs to be corrected?

My own worry is about the semantics I gave to deepcopy, and I wonder if it is correct w.r.t. deep copying the history.

I think a list of common use cases for each would be wonderful.

These are all the resources I've read and participated in to arrive at the conclusions in this question:
torch.clone()
Copies the tensor while maintaining a link in the autograd graph. To be used if you want to, e.g., duplicate a tensor as an operation in a neural network.
From the documentation:

Returns a copy of input.

NOTE: This function is differentiable, so gradients will flow back from the result of this operation to input. To create a tensor without an autograd relationship to input, see detach().
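A quick sketch of that gradient flow (my own example, not from the docs):

```python
import torch

w = torch.tensor([2.0, 3.0], requires_grad=True)
w_copy = w.clone()          # separate memory, but still in the autograd graph

loss = (w_copy ** 2).sum()
loss.backward()
print(w.grad)               # tensor([4., 6.]): gradients flowed back through clone
```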
torch.Tensor.detach()

Returns a view of the original tensor without the autograd history. To be used if you want to manipulate the values of a tensor (not in place) without affecting the computational graph (e.g. reporting values midway through the forward pass).

From the documentation:

Returns a new Tensor, detached from the current graph.

The result will never require gradient.

This method also affects forward mode AD gradients and the result will never have forward mode AD gradients.

NOTE: Returned Tensor shares the same storage with the original one. In-place modifications on either of them will be seen, and may trigger errors in correctness checks.
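Those correctness checks look roughly like this in practice (a minimal sketch of my own, not from the docs):

```python
import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)
y = x.sigmoid()     # sigmoid saves its output for the backward pass
d = y.detach()

d.zero_()           # shared storage: this also zeroes the values y saved
# y.sum().backward()  # RuntimeError: one of the variables needed for gradient
#                     # computation has been modified by an inplace operation
```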
copy.deepcopy

deepcopy is a generic python function from the copy library which makes a copy of an existing object (recursively, if the object itself contains objects).

It is used (as opposed to more usual assignment) when the underlying object you wish to make a copy of is mutable (or contains mutables) and would be susceptible to mirroring changes made in one:

Assignment statements in Python do not copy objects, they create bindings between a target and an object. For collections that are mutable or contain mutable items, a copy is sometimes needed so one can change one copy without changing the other.

In a PyTorch setting, as you say, if you want a fresh copy of a tensor object to use in a completely different setting with no relationship or effect on its parent, you should use .detach().clone().
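For example (a sketch, using throwaway variable names):

```python
import torch

a = torch.tensor([1.0, 2.0], requires_grad=True)
b = a.detach().clone()   # fully independent: no shared memory, no shared graph

b.mul_(100)              # safe: does not touch a or its autograd state
print(a)                 # tensor([1., 2.], requires_grad=True)
print(b.requires_grad)   # False
```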
IMPORTANT NOTE: Previously, in-place size / stride / storage changes (such as resize_ / resize_as_ / set_ / transpose_) to the returned tensor also updated the original tensor. Now, these in-place changes will not update the original tensor anymore, and will instead trigger an error. For sparse tensors: in-place indices / values changes (such as zero_ / copy_ / add_) to the returned tensor will not update the original tensor anymore, and will instead trigger an error.