Loss is 'nan' all the time when training the neural network in PyTorch
I assigned different weight_decay values to the parameters, and both the training loss and the testing loss were nan.
I printed prediction_train, loss_train, running_loss_train, prediction_test, loss_test, and running_loss_test; they were all nan.
I also checked the data with numpy.any(numpy.isnan(dataset)), and it returned False.
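The same kind of check can be applied to the model rather than the data. Below is a minimal sketch, assuming a helper named find_nonfinite (a hypothetical name, not part of PyTorch) that lists the parameters whose values or gradients contain nan or inf. The nn.Linear layer is only a stand-in for the demonstration, not the network from this question.

```python
import torch
import torch.nn as nn


def find_nonfinite(model):
    """Return names of parameters whose values or gradients contain nan/inf."""
    bad = []
    for name, param in model.named_parameters():
        if not torch.isfinite(param).all():
            bad.append(name)
        elif param.grad is not None and not torch.isfinite(param.grad).all():
            bad.append(name + ".grad")
    return bad


# Tiny demonstration on a stand-in layer (not the WNN from the question).
layer = nn.Linear(3, 2)
clean = find_nonfinite(layer)          # nothing wrong right after initialization
with torch.no_grad():
    layer.weight[0, 0] = float("nan")  # poison one weight on purpose
dirty = find_nonfinite(layer)          # now the weight is reported
```

Calling such a helper right after optimizer.step() in the training loop would show which parameter becomes non-finite first.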
If I use optimizer = torch.optim.Adam(wnn.parameters()) instead of assigning a different weight_decay to each parameter, there is no problem.
Could you please tell me how to fix it? Here is the code; I defined the activation function myself. Thank you :)
import torch
import torch.nn as nn
import matplotlib.pyplot as plt

class Morlet(nn.Module):
    def __init__(self):
        super(Morlet,self).__init__()
    def forward(self,x):
        x=(torch.cos(1.75*x))*(torch.exp(-0.5*x*x))
        return x

morlet=Morlet()

class WNN(nn.Module):
    def __init__(self):
        super(WNN,self).__init__()
        self.a1=torch.nn.Parameter(torch.randn(64,requires_grad=True))
        self.b1=torch.nn.Parameter(torch.randn(64,requires_grad=True))
        self.layer1=nn.Linear(30,64,bias=False)
        self.out=nn.Linear(64,1)
    def forward(self,x):
        x=self.layer1(x)
        x=(x-self.b1)/self.a1
        x=morlet(x)
        out=self.out(x)
        return out

wnn=WNN()
optimizer = torch.optim.Adam([{'params': wnn.layer1.weight, 'weight_decay':0.01},
                              {'params': wnn.out.weight, 'weight_decay':0.01},
                              {'params': wnn.out.bias, 'weight_decay':0},
                              {'params': wnn.a1, 'weight_decay':0.01},
                              {'params': wnn.b1, 'weight_decay':0.01}])
criterion = nn.MSELoss()
for epoch in range(10):
    prediction_test_list=[]
    running_loss_train=0
    running_loss_test=0
    for i,(x1,y1) in enumerate(trainloader):
        prediction_train=wnn(x1)
        #print(prediction_train)
        loss_train=criterion(prediction_train,y1)
        #print(loss_train)
        optimizer.zero_grad()
        loss_train.backward()
        optimizer.step()
        running_loss_train+=loss_train.item()
        #print(running_loss_train)
    tr_loss=running_loss_train/train_set_y_array.shape[0]
    for i,(x2,y2) in enumerate(testloader):
        prediction_test=wnn(x2)
        #print(prediction_test)
        loss_test=criterion(prediction_test,y2)
        #print(loss_test)
        running_loss_test+=loss_test.item()
        print(running_loss_test)
        prediction_test_list.append(prediction_test.detach().cpu())
    ts_loss=running_loss_test/test_set_y_array.shape[0]
    print('Epoch {} Train Loss:{}, Test Loss:{}'.format(epoch+1,tr_loss,ts_loss))
    test_set_y_array_plot=test_set_y_array*(dataset.max()-dataset.min())+dataset.min()
    prediction_test_np=torch.cat(prediction_test_list).numpy()
    prediction_test_plot=prediction_test_np*(dataset.max()-dataset.min())+dataset.min()
    plt.plot(test_set_y_array_plot.flatten(),'r-',linewidth=0.5,label='True data')
    plt.plot(prediction_test_plot,'b-',linewidth=0.5,label='Predicted data')
    plt.legend()
    plt.show()
print('Finish training')
The output was:
Epoch 1 Train Loss:nan, Test Loss:nan
Only the true data appeared on the plot, as the picture shows.
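Weight decay applies L2 regularization to the learned parameters. From a quick look at your code, you use the a1 weights as a denominator in x=(x-self.b1)/self.a1. With a weight decay of 0.01, some of those a1 weights could be driven to zero, and what is the result of a division by zero?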
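One possible way to act on this is sketched below, under stated assumptions. The class name SafeWNN, the eps value, and the sign-preserving clamp are all hypothetical choices, not something from the original code. The idea is to keep the magnitude of a1 at or above eps so the division stays finite, and to leave a1 out of weight decay so that nothing pulls it toward zero. The first lines also show what the unguarded division does to the Morlet activation.

```python
import torch
import torch.nn as nn


def morlet(x):
    # Same wavelet as in the question: cos(1.75 x) * exp(-x^2 / 2)
    return torch.cos(1.75 * x) * torch.exp(-0.5 * x * x)


# Unguarded case: 1 / 0 = inf, cos(inf) = nan, and nan * 0 stays nan.
naive = morlet((torch.ones(1) - 0.0) / torch.zeros(1))


class SafeWNN(nn.Module):
    """Variant of the question's WNN; eps and the clamping are assumptions."""

    def __init__(self, eps=1e-3):
        super().__init__()
        self.eps = eps
        self.a1 = nn.Parameter(torch.randn(64))
        self.b1 = nn.Parameter(torch.randn(64))
        self.layer1 = nn.Linear(30, 64, bias=False)
        self.out = nn.Linear(64, 1)

    def forward(self, x):
        x = self.layer1(x)
        # Keep |a1| >= eps while preserving its sign, so the division is finite.
        sign = torch.where(self.a1 >= 0,
                           torch.ones_like(self.a1),
                           -torch.ones_like(self.a1))
        scale = sign * self.a1.abs().clamp(min=self.eps)
        x = (x - self.b1) / scale
        return self.out(morlet(x))


model = SafeWNN()
# Decay the ordinary weights, but not the bias or the scale parameter a1.
optimizer = torch.optim.Adam([
    {'params': [model.layer1.weight, model.out.weight, model.b1],
     'weight_decay': 0.01},
    {'params': [model.out.bias, model.a1], 'weight_decay': 0.0},
])

# Worst case for the original code: every a1 entry is exactly zero.
with torch.no_grad():
    model.a1.zero_()
out = model(torch.randn(4, 30))
```

Whether this removes the nan in the actual training run depends on the data and on where the first non-finite value really appears, so it is worth confirming by inspecting a1 during training.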