How to transform output of NN, while still being able to train?
I have a neural network which outputs output. I want to transform output before the loss and backpropagation happen.
Here is my general code:
with torch.set_grad_enabled(training):
    outputs = net(x_batch[:, 0], x_batch[:, 1])  # the prediction of the NN
    # My issue is here:
    outputs = transform_torch(outputs)
    loss = my_loss(outputs, y_batch)

    if training:
        scheduler.step()
        loss.backward()
        optimizer.step()
Following the advice in How to transform output of neural network and still train?, I have a transformation function which I put my output through:
def transform_torch(predictions):
    new_tensor = []
    for i in range(int(len(predictions))):
        arr = predictions[i]
        a = arr.clone().detach()
        # My transformation, which results in a positive first element, and
        # the other elements represent decrements of the first positive element.
        b = torch.negative(a)
        b[0] = abs(b[0])
        new_tensor.append(torch.cumsum(b, dim=0))
        # new_tensor[i].requires_grad = True
    new_tensor = torch.stack(new_tensor, 0)
    return new_tensor
Note: In addition to clone().detach(), I also tried the methods described in Pytorch preferred way to copy a tensor, with similar results.
My problem is that no training actually happens with this transformed tensor.
If I try to modify the tensor in-place (e.g., directly modify arr), then Torch complains that I can't modify a tensor in-place with a gradient attached to it.
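For reference, here is a minimal repro of that kind of error, constructed for illustration (not taken from the original post):

import torch

a = torch.randn(3, requires_grad=True)
a[0] = a[0].abs()
# RuntimeError: a leaf Variable that requires grad is being used in an in-place operation.

(For non-leaf tensors inside a graph, the complaint instead surfaces at backward() time, as "one of the variables needed for gradient computation has been modified by an inplace operation".)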
Any suggestions?
Calling detach on your predictions stops gradient propagation to your model. Nothing you do after that can change your parameters.
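A quick toy illustration of why (made up here, not the asker's code): detach returns a tensor that is cut off from the graph, so anything computed from it is invisible to autograd:

import torch

x = torch.randn(3, requires_grad=True)
y = (x * 2).detach()    # y is severed from the graph
print(y.requires_grad)  # False -- gradients can no longer reach x through y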
How about modifying your code to avoid this:
def transform_torch(predictions):
    # Keep the first element positive, negate the rest, then cumsum --
    # all differentiable ops, so no clone/detach is needed.
    b = torch.cat([predictions[:, :1, ...].abs(), -predictions[:, 1:, ...]], dim=1)
    new_tensor = torch.cumsum(b, dim=1)
    return new_tensor
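Because the transformed tensor is built entirely from differentiable ops (abs, negation, cat, cumsum), autograd can see through it. A quick sanity check with dummy data, assuming the function above:

import torch

predictions = torch.randn(4, 3, requires_grad=True)
out = transform_torch(predictions)
out.sum().backward()                 # backprop through cumsum/cat/abs
print(predictions.grad is not None)  # True -- the graph is intact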
How about extracting the grad from the output tensor, with something like

grad = output.grad

and, after the transformation, assigning the same gradient to the new tensor? See the sketch below.
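One way to read this suggestion, sketched under the assumption that the transform has to break the graph (if it is differentiable, as in the answer above, none of this is needed): detach, run the transform on a fresh leaf, then hand the resulting gradient back to the original output via backward(gradient=...). The model and loss here are stand-ins, and transform_torch is the version from the answer above:

import torch

model = torch.nn.Linear(4, 3)                     # stand-in for the asker's net
outputs = model(torch.randn(8, 4))

detached = outputs.detach().requires_grad_(True)  # fresh leaf; the graph is cut here
transformed = transform_torch(detached)           # transform from the answer above
loss = transformed.pow(2).mean()                  # stand-in loss
loss.backward()                                   # fills detached.grad
outputs.backward(detached.grad)                   # route that gradient back into the model
print(model.weight.grad is not None)              # True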