Gradient Computation broken by Sigmoid function in Pytorch

Hey, I have been struggling with this weird problem. Here is my code for the neural net:

import torch
import torch.nn as nn

# batch_size and input_sizes are assumed to be defined elsewhere in the script.

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # Three 3D conv layers with 9x9x9 kernels, stride 1, padding 4 (shape-preserving).
        self.conv_3d_ = nn.Sequential(
            nn.Conv3d(1, 1, 9, 1, 4),
            nn.LeakyReLU(),
            nn.Conv3d(1, 1, 9, 1, 4),
            nn.LeakyReLU(),
            nn.Conv3d(1, 1, 9, 1, 4),
            nn.LeakyReLU()
        )

        # Fully connected head; the final activation is the Sigmoid in question.
        self.linear_layers_ = nn.Sequential(
            nn.Linear(batch_size * 32 * 32 * 32, batch_size * 32 * 32 * 3),
            nn.LeakyReLU(),
            nn.Linear(batch_size * 32 * 32 * 3, batch_size * 32 * 32 * 3),
            nn.Sigmoid()
        )

    def forward(self, x, y, z):
        conv_layer = x + y + z
        conv_layer = self.conv_3d_(conv_layer)
        conv_layer = torch.flatten(conv_layer)
        conv_layer = self.linear_layers_(conv_layer)
        conv_layer = conv_layer.view((batch_size, 3, input_sizes, input_sizes))
        return conv_layer

The weird problem I am facing is that running this NN gives me the error:

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [3072]], which is output 0 of SigmoidBackward, is at version 1; expected version 0 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!

The stack trace shows that the issue is in the line

conv_layer = self.linear_layers_(conv_layer)

However, if I replace the last activation function of my FCN, nn.Sigmoid(), with nn.LeakyReLU(), the NN executes properly.

Can anyone tell me why the Sigmoid activation function is causing my backward computation to break?
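For context, here is a minimal sketch of the same error pattern, independent of the model above (the tensor names are illustrative). Sigmoid's backward pass reuses its saved output, so an in-place edit of that output before backward() invalidates the saved tensor; this is also consistent with LeakyReLU not triggering the error, since its backward uses the input rather than the output.

import torch

x = torch.randn(4, requires_grad=True)
y = torch.sigmoid(x)   # SigmoidBackward saves the output y, since d/dx sigmoid(x) = y * (1 - y)
y.add_(1.0)            # in-place edit bumps y's version counter
y.sum().backward()     # RuntimeError: one of the variables needed for gradient computation
                       # has been modified by an inplace operation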

I found the problem with my code. I delved deeper into what in-place actually means. So, if you check the line

conv_layer = self.linear_layers_(conv_layer)

the linear_layers_ part of this assignment is changing the values of conv_layer in place; as a result the values get overwritten, and because of this, gradient computation fails. An easy solution to this problem is to use the clone() function,

i.e.

conv_layer = self.linear_layers_(conv_layer).clone()

This creates a copy of the right-hand-side computation, and Autograd is able to keep its reference to the computation graph intact.
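The same effect can be seen on a small tensor (again a sketch with illustrative names, not the model above): cloning the Sigmoid output before any later in-place update leaves the tensor saved by SigmoidBackward untouched, so backward() runs to completion.

import torch

x = torch.randn(4, requires_grad=True)
y = torch.sigmoid(x).clone()  # the clone is a separate tensor; SigmoidBackward's saved output stays at version 0
y.add_(1.0)                   # the in-place edit now only touches the copy
y.sum().backward()            # succeeds; x.grad holds sigmoid(x) * (1 - sigmoid(x))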
