Taking a derivative through torch.ge, or how to explicitly define a derivative in PyTorch
I am trying to set up a network in which one layer maps from real numbers to {0, 1} (i.e., makes the output binary).
While I was able to find that torch.ge provides such functionality, PyTorch breaks whenever I try to train any parameter that occurs before that layer in the network.
I have also been trying to find out whether there is any way in PyTorch/autograd to override the derivative of a module by hand. More specifically, in this case I would just like to pass the derivative through torch.ge without changing it.
Here is a minimal example I produced, which uses a typical neural network training structure in PyTorch.
import torch
import torch.nn as nn
import torch.optim as optim

class LinearGE(nn.Module):
    def __init__(self, features_in, features_out):
        super().__init__()
        self.fc = nn.Linear(features_in, features_out)

    def forward(self, x):
        return torch.ge(self.fc(x), 0)

x = torch.randn(size=(10, 30))
y = torch.randint(2, size=(10, 10))

# Define Model
m1 = LinearGE(30, 10)
opt = optim.SGD(m1.parameters(), lr=0.01)
crit = nn.MSELoss()

# Train Model
for x_batch, y_batch in zip(x, y):
    # zero the parameter gradients
    opt.zero_grad()
    # forward + backward + optimize
    pred = m1(x_batch)
    loss = crit(pred.float(), y_batch.float())
    loss.backward()
    opt.step()
When I run the above code, the following error occurs:
File "__minimal.py", line 33, in <module>
loss.backward()
...
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
This error makes sense, since the torch.ge function is not differentiable. However, since MaxPool2d is also not differentiable, I believe that there are ways of mitigating non-differentiability in PyTorch.
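A small check makes the difference between the two cases concrete. This is only a sketch, assuming a recent PyTorch release: torch.ge returns a boolean tensor that is detached from the autograd graph, whereas max pooling keeps a grad_fn and routes the gradient to the element selected in each window.

```python
import torch
import torch.nn.functional as F

x = torch.randn(4, requires_grad=True)

# torch.ge returns a boolean tensor; autograd does not track it.
mask = torch.ge(x, 0)
print(mask.dtype, mask.requires_grad, mask.grad_fn)  # torch.bool False None

# Casting to float afterwards does not reconnect the result to x.
mask_f = mask.float()
print(mask_f.requires_grad)  # False

# Max pooling stays in the graph: each 2x2 window sends its gradient
# to the element that was selected as the maximum.
img = torch.randn(1, 1, 4, 4, requires_grad=True)
pooled = F.max_pool2d(img, kernel_size=2)
pooled.sum().backward()
print(pooled.grad_fn is not None)  # True
print(img.grad)  # one 1.0 per window, zeros elsewhere
```

So the error comes from the comparison cutting the graph, not from non-differentiability at isolated points.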
It would be great if someone could point me to any source that can help me either implement my own backprop for a custom module or avoid this error message.
Thanks!
Two things I noticed:
If your input x is 10x30 (10 examples, 30 features) and the number of output nodes is 10, then the parameter matrix is 30x10. The expected output matrix is 10x10 (10 examples, 10 output nodes).
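The shapes can be checked directly by passing the whole batch through the layer. This is a minimal sketch using the sizes from the question. Note that nn.Linear stores its weight as (out_features, in_features), so the 30x10 matrix described above appears transposed, as 10x30.

```python
import torch
import torch.nn as nn

fc = nn.Linear(30, 10)   # 30 input features, 10 output nodes
x = torch.randn(10, 30)  # 10 examples, 30 features

out = fc(x)              # computes x @ fc.weight.T + fc.bias
print(out.shape)         # torch.Size([10, 10])
print(fc.weight.shape)   # torch.Size([10, 30])
```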
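ge means greater than or equal to. As the code indicates, the comparison x >= 0 is applied elementwise. We can use ReLU instead, which is differentiable almost everywhere, although its output is a non-negative real number rather than a binary value: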
class LinearGE(nn.Module):
    def __init__(self, features_in, features_out):
        super().__init__()
        self.fc = nn.Linear(features_in, features_out)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.fc(x))
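or torch.max, taken elementwise against a zero tensor (note that torch.max(t, 0) would instead reduce along dimension 0 and return a (values, indices) pair):
out = self.fc(x)
torch.max(out, torch.zeros_like(out))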
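If the output really has to be binary, one common workaround is a straight-through estimator: a custom torch.autograd.Function that applies torch.ge in the forward pass and hands the incoming gradient back unchanged in the backward pass, which is the behaviour the question asks for. The following is only a sketch of that idea, not the only way to do it. The names GEPassThrough and LinearGEPassThrough are hypothetical and not part of PyTorch.

```python
import torch
import torch.nn as nn
import torch.optim as optim


class GEPassThrough(torch.autograd.Function):
    """Hypothetical helper: binarize in forward, identity gradient in backward."""

    @staticmethod
    def forward(ctx, x):
        # 1.0 where x >= 0, else 0.0; cast to float so autograd can track it.
        return torch.ge(x, 0).to(x.dtype)

    @staticmethod
    def backward(ctx, grad_output):
        # Treat the op as the identity: pass the gradient through unchanged.
        return grad_output


class LinearGEPassThrough(nn.Module):
    def __init__(self, features_in, features_out):
        super().__init__()
        self.fc = nn.Linear(features_in, features_out)

    def forward(self, x):
        return GEPassThrough.apply(self.fc(x))


torch.manual_seed(0)
x = torch.randn(10, 30)
y = torch.randint(2, size=(10, 10)).float()

model = LinearGEPassThrough(30, 10)
opt = optim.SGD(model.parameters(), lr=0.01)
crit = nn.MSELoss()

# One training step on the full batch.
opt.zero_grad()
pred = model(x)        # values are exactly 0.0 or 1.0
loss = crit(pred, y)
loss.backward()        # no RuntimeError: gradients reach model.fc
opt.step()
```

The gradient used here is an approximation chosen by hand, since the true derivative of the step function is zero almost everywhere. Whether training converges with it depends on the task.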