
What is the difference between detach, clone and deepcopy in Pytorch tensors in detail?

I've been struggling to understand the differences between .clone(), .detach() and copy.deepcopy when using Pytorch. In particular with Pytorch tensors.

I tried writing down all my questions about their differences and use cases, quickly became overwhelmed, and realized that perhaps laying out the 4 main properties of Pytorch tensors would clarify which one to use much better than going through every small question. The 4 main properties I realized one needs to keep track of are (a small snippet after this list sketches how one might check each of them):

  1. if one has a new pointer/reference to a tensor
  2. if one has a new tensor object instance (and thus most likely this new instance has its own meta-data like requires_grad, shape, is_leaf, etc.)
  3. if it has allocated new memory for the tensor data (i.e. whether this new tensor is a view of a different tensor or not)
  4. if it's tracking the history of operations or not (or even if it's tracking a completely new history of operations or the same old one, in the case of deep copy)
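
To make these 4 properties concrete, here is a small sketch (my own illustrative snippet, not taken from any of the resources below) of how one might check each of them, using clone as an example; the exact grad_fn repr may differ between PyTorch versions:

import torch

x = torch.tensor([1., 2., 3.], requires_grad=True)
y = x.clone()  # using clone just as an example here

# 1. new python pointer/reference?
print(id(y) != id(x))                  # True: y is a different python object
# 2. new tensor object instance with its own meta-data?
print(x.is_leaf, y.is_leaf)            # True False: the clone has its own meta-data
# 3. new memory allocated for the tensor data?
print(y.data_ptr() != x.data_ptr())    # True: clone allocated its own storage
# 4. tracking the history of operations?
print(y.grad_fn)                       # e.g. <CloneBackward0 ...>: still in x's history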

According to what I mined out from the Pytorch forums and the documentation, these are my current distinctions for each when used on tensors:

Clone

For clone:

x_cloned = x.clone()

I believe this is how it behaves according to the 4 main properties:

  1. the cloned x_cloned has its own python reference/pointer to the new object
  2. it has created its own new tensor object instance (with its separate meta-data)
  3. it has allocated new memory for x_cloned with the same data as x
  4. it keeps track of the original history of operations and, in addition, includes this clone operation as .grad_fn=<CloneBackward>

it seems that the main use of this, as I understand it, is to create copies of things so that in-place operations are safe. In addition, coupled with .detach as .detach().clone() (the "better" order to do it, btw), it creates a completely new tensor that is detached from the old history and thus stops gradient flow through that path.
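
A minimal sketch of this behavior (my own illustrative snippet, assuming the usual autograd behavior), showing that the clone has its own memory but still lets gradients flow back to x:

import torch

x = torch.tensor([1., 2., 3.], requires_grad=True)
x_cloned = x.clone()

# the clone stays in the same autograd history, so gradients flow back to x
x_cloned.sum().backward()
print(x.grad)              # tensor([1., 1., 1.])

# the clone owns its own memory, so in-place ops on it do not touch x's data
with torch.no_grad():
    x_cloned.add_(100.)
print(x)                   # tensor([1., 2., 3.], requires_grad=True) -- unchanged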

Detach

x_detached = x.detach()
  1. it creates a new python reference (the only approach that does not is doing x_new = x, of course). One can use id to check this, I believe
  2. it has created its own new tensor object instance (with its separate meta-data)
  3. it has NOT allocated new memory for x_detached: it points to the same data as x
  4. it cuts the history of the gradients and does not allow gradients to flow through it. I think it's right to think of it as having no history, like a brand-new tensor.

I believe the only sensible use I know of is creating new copies with their own memory when coupled with .clone(), as .detach().clone(). Otherwise, I am not sure what the use is. Since it points to the original data, doing in-place ops might be potentially dangerous (since they change the old data, but the change to the old data is NOT known by autograd in the earlier computation graph).
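
A small check of the above (again my own illustrative snippet), showing that the detached tensor shares storage with x, so in-place ops on it silently change x's data:

import torch

x = torch.tensor([1., 2., 3.], requires_grad=True)
x_detached = x.detach()

print(x_detached.requires_grad)               # False: cut off from the autograd history
print(x_detached.data_ptr() == x.data_ptr())  # True: NO new memory, same storage as x

# because the storage is shared, this in-place op also changes x's data
x_detached.zero_()
print(x)                                      # tensor([0., 0., 0.], requires_grad=True)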

copy.deepcopy

x_deepcopy = copy.deepcopy(x)
  1. it has a new pointer/reference to the tensor
  2. it creates a new tensor instance with its own meta-data (all of the meta-data should point to deep copies, so they are new objects, if it's implemented as one would expect, I hope)
  3. it has its own memory allocated for the tensor data
  4. If it truly is a deep copy, I would expect a deep copy of the history. So it should do a deep replication of the history. Though this seems really expensive, it would at least be semantically consistent with what deep copy should be.

I don't really see a use case for this. I assume anyone trying to use this really meant 1) .detach().clone() or just 2) .clone() by itself, depending on whether one wants to stop gradient flow to the earlier graph with 1), or just wants to replicate the data with new memory with 2).
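
For completeness, a sketch of what I observe deepcopy doing on a leaf tensor (my own snippet; in the versions I've tried, deepcopy only works on leaf tensors and the copy ends up as a fresh leaf, so I'm not sure the history really gets replicated):

import copy
import torch

x = torch.tensor([1., 2., 3.], requires_grad=True)
x_deepcopy = copy.deepcopy(x)

print(x_deepcopy.data_ptr() != x.data_ptr())  # True: its own memory for the data
print(x_deepcopy.requires_grad)               # True: meta-data was copied over
print(x_deepcopy.grad_fn)                     # None: the copy is a fresh leaf tensor

# gradients computed through the copy do not reach the original x
x_deepcopy.sum().backward()
print(x_deepcopy.grad)                        # tensor([1., 1., 1.])
print(x.grad)                                 # None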

So this is the best way I have to understand the differences as of now, rather than asking about all the different scenarios in which one might use them.

So is this right? Does anyone see any major flaw that needs to be corrected?

My own worry is about the semantics I gave to deep copy, and I wonder if it's correct w.r.t. deep copying the history.

I think a list of common use cases for each would be wonderful.


Resources

These are all the resources I've read and discussions I've participated in to arrive at the conclusions in this question:

torch.clone()

Copies the tensor while maintaining a link in the autograd graph. To be used if you want to e.g. duplicate a tensor as an operation in a neural network.

Returns a copy of input.

NOTE: This function is differentiable, so gradients will flow back from the result of this operation to input. To create a tensor without an autograd relationship to input, see detach().

torch.tensor.detach()

Returns a view of the original tensor without the autograd history. To be used if you want to manipulate the values of a tensor (not in place) without affecting the computational graph (e.g. reporting values midway through the forward pass).

Returns a new Tensor, detached from the current graph.

The result will never require gradient.

This method also affects forward mode AD gradients and the result will never have forward mode AD gradients.

NOTE: Returned Tensor shares the same storage with the original one. In-place modifications on either of them will be seen, and may trigger errors in correctness checks. [1]
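
To illustrate the correctness-check errors that note refers to, here is a hedged sketch (my own example; the exact error message may vary by version): an in-place change through the detached tensor bumps the shared version counter, so a later backward that needs the original values fails:

import torch

x = torch.tensor([1., 2., 3.], requires_grad=True)
y = (x ** 2).sum()     # pow saves x for its backward pass

x.detach().zero_()     # in-place modification through the shared storage

# raises: RuntimeError: one of the variables needed for gradient computation
# has been modified by an inplace operation
y.backward()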

copy.deepcopy

deepcopy is a generic python function from the copy library which makes a copy of an existing object (recursively, if the object itself contains objects).

This is used (as opposed to more usual assignment) when the underlying object you wish to make a copy of is mutable (or contains mutables) and would be susceptible to mirroring changes made in one:

Assignment statements in Python do not copy objects, they create bindings between a target and an object. For collections that are mutable or contain mutable items, a copy is sometimes needed so one can change one copy without changing the other.
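
A tiny plain-python illustration of that quoted point (my own example), contrasting assignment with deepcopy:

import copy

a = [[1, 2], [3, 4]]
b = a                  # assignment: b is just another binding to the same object
c = copy.deepcopy(a)   # deepcopy: a recursive, fully independent copy

a[0][0] = 99
print(b[0][0])         # 99 -- b mirrors the change made through a
print(c[0][0])         # 1  -- the deep copy is unaffected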

In a PyTorch setting, as you say, if you want a fresh copy of a tensor object to use in a completely different setting with no relationship or effect on its parent, you should use .detach().clone().
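
A short sketch of that recommendation (my own snippet), showing the result has its own storage and no autograd relationship to its parent:

import torch

x = torch.tensor([1., 2., 3.], requires_grad=True)
x_copy = x.detach().clone()

print(x_copy.requires_grad)                # False: no autograd link back to x
print(x_copy.data_ptr() != x.data_ptr())   # True: the clone allocated new storage
x_copy.add_(10.)                           # safe: modifies only the copy
print(x)                                   # tensor([1., 2., 3.], requires_grad=True)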


  1. IMPORTANT NOTE: Previously, in-place size / stride / storage changes (such as resize_ / resize_as_ / set_ / transpose_) to the returned tensor also updated the original tensor. Now, these in-place changes will not update the original tensor anymore, and will instead trigger an error. For sparse tensors: in-place indices / values changes (such as zero_ / copy_ / add_) to the returned tensor will not update the original tensor anymore, and will instead trigger an error.
