
How to compute the validation loss? (Simple linear regression)

I am currently learning how to use PyTorch to build a neural network. I have learned Keras before, and I would like to do the same thing in PyTorch as `model.fit`: train the model and plot a graph containing both the training loss and the validation loss.

In order to know whether the model is underfitting or not, I have to plot a graph comparing the training loss and the validation loss.

However, I cannot compute the right validation loss. I know that gradients should only be updated during training, not during testing/validation. With no change in the gradients, does that mean the loss will not change? Sorry, my understanding is not clear enough. But I think not: the loss should still be computed by comparing the expected output with the prediction using the loss function.
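To make my understanding concrete, here is a minimal sketch of how I believe a validation loss is computed (reusing the variable names from my code below). No gradients are tracked and no parameters change during this pass, so the value only moves between epochs because the training step that runs before it updates the weights:

model.eval()                      # switch the model to evaluation mode
with torch.no_grad():             # do not build the autograd graph
    val_loss = criterion(model(inputs_val), outputs_val).item()
model.train()                     # switch back before the next training step
print(val_loss)                   # a plain float; changes only when training updates the weights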

In my code, 80 samples are used for training and 20 samples are used for validation. The neural network is predicting this formula: y = 2x^3 + 7x^2 - 8x + 120. It is easy to compute, so I use it to learn how to build a neural network with PyTorch.
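The contents of test_100.csv are not shown here; as an illustration, a file with that shape could be generated with something like this (the column names and input range are assumptions):

import numpy as np
import pandas as pd

x = np.random.uniform(-10, 10, size=100)   # 100 random inputs (range assumed)
y = 2 * x**3 + 7 * x**2 - 8 * x + 120      # the target polynomial
pd.DataFrame({'X': x, 'Y': y}).to_csv('test_100.csv', index=False)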

Here is my code:

import torch
import torch.nn as nn    #neural network model
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import torch.nn.functional as F
from sklearn.preprocessing import MinMaxScaler

#Load datasets
dataset = pd.read_csv('test_100.csv')

X = dataset.iloc[:, :-1].values
Y = dataset.iloc[:, -1:].values

#Scale both X and Y to the [0, 1] range
X_scaler = MinMaxScaler()
Y_scaler = MinMaxScaler()
X = X_scaler.fit_transform(X)
Y = Y_scaler.fit_transform(Y)

x_temp_train = X[:80]   #first 80 samples for training
y_temp_train = Y[:80]
x_temp_test = X[80:]    #last 20 samples for validation
y_temp_test = Y[80:]

X_train = torch.FloatTensor(x_temp_train)
Y_train = torch.FloatTensor(y_temp_train)
X_test = torch.FloatTensor(x_temp_test)
Y_test = torch.FloatTensor(y_temp_test)

D_in = 1  # number of input features
H = 24    # hidden layer dimension
D_out = 1 # number of output features

#Define an artificial neural network model
class Net(nn.Module):
#------------------Two Layers------------------------------
    def __init__(self, D_in, H, D_out):
        super(Net, self).__init__()

        self.linear1 = nn.Linear(D_in, H)  
        self.linear2 = nn.Linear(H, D_out)
        
    def forward(self, x):
        h_relu = self.linear1(x).clamp(min=0)  #clamp(min=0) acts as a ReLU
        prediction = self.linear2(h_relu)
        return prediction

model = Net(D_in, H, D_out)
print(model)

#Define a Loss function and optimizer
criterion = torch.nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.2)

#Training
inputs = X_train        #Variable is deprecated since PyTorch 0.4; tensors work directly
outputs = Y_train
inputs_val = X_test
outputs_val = Y_test
loss_values = []
val_values = []
epoch = []
for phase in ['train', 'val']:
    if phase == 'train':
        model.train()  # Set model to training mode
    else:
        optimizer.zero_grad() #zero the parameter gradients
        model.eval()   # Set model to evaluate mode
    for i in range(50):      #epoch=50
        if phase == 'train':
            model.train()
            prediction = model(inputs)
            loss = criterion(prediction, outputs) 
            print('train loss')
            print(loss)
            loss_values.append(loss.detach())
            optimizer.zero_grad() #zero the parameter gradients
            epoch.append(i)
            loss.backward()       #compute gradients(dloss/dx)
            optimizer.step()      #updates the parameters
        elif phase == 'val':
            model.eval()
            prediction_val = model(inputs_val)
            loss_val = criterion(prediction_val, outputs_val) 
            print('validation loss')
            print(loss_val)
            val_values.append(loss_val.detach())
            optimizer.zero_grad() #zero the parameter gradients
          
plt.plot(epoch,loss_values)
plt.plot(epoch, val_values)
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train','validation'], loc='upper left')
plt.show()

Here is the result:

train loss
tensor(0.9788, grad_fn=<MseLossBackward>)
tensor(2.0834, grad_fn=<MseLossBackward>)
tensor(3.2902, grad_fn=<MseLossBackward>)
tensor(0.8851, grad_fn=<MseLossBackward>)
tensor(0.0832, grad_fn=<MseLossBackward>)
tensor(0.0402, grad_fn=<MseLossBackward>)
tensor(0.0323, grad_fn=<MseLossBackward>)
tensor(0.0263, grad_fn=<MseLossBackward>)
tensor(0.0217, grad_fn=<MseLossBackward>)
tensor(0.0181, grad_fn=<MseLossBackward>)
tensor(0.0153, grad_fn=<MseLossBackward>)
tensor(0.0132, grad_fn=<MseLossBackward>)
tensor(0.0116, grad_fn=<MseLossBackward>)
tensor(0.0103, grad_fn=<MseLossBackward>)
tensor(0.0094, grad_fn=<MseLossBackward>)
tensor(0.0087, grad_fn=<MseLossBackward>)
tensor(0.0081, grad_fn=<MseLossBackward>)
tensor(0.0077, grad_fn=<MseLossBackward>)
tensor(0.0074, grad_fn=<MseLossBackward>)
tensor(0.0072, grad_fn=<MseLossBackward>)
tensor(0.0070, grad_fn=<MseLossBackward>)
tensor(0.0068, grad_fn=<MseLossBackward>)
tensor(0.0067, grad_fn=<MseLossBackward>)
tensor(0.0067, grad_fn=<MseLossBackward>)
tensor(0.0066, grad_fn=<MseLossBackward>)
tensor(0.0065, grad_fn=<MseLossBackward>)
tensor(0.0065, grad_fn=<MseLossBackward>)
tensor(0.0065, grad_fn=<MseLossBackward>)
tensor(0.0064, grad_fn=<MseLossBackward>)
tensor(0.0064, grad_fn=<MseLossBackward>)
tensor(0.0064, grad_fn=<MseLossBackward>)
tensor(0.0064, grad_fn=<MseLossBackward>)
tensor(0.0063, grad_fn=<MseLossBackward>)
tensor(0.0063, grad_fn=<MseLossBackward>)
tensor(0.0063, grad_fn=<MseLossBackward>)
tensor(0.0063, grad_fn=<MseLossBackward>)
tensor(0.0063, grad_fn=<MseLossBackward>)
tensor(0.0062, grad_fn=<MseLossBackward>)
tensor(0.0062, grad_fn=<MseLossBackward>)
tensor(0.0062, grad_fn=<MseLossBackward>)
tensor(0.0062, grad_fn=<MseLossBackward>)
tensor(0.0062, grad_fn=<MseLossBackward>)
tensor(0.0062, grad_fn=<MseLossBackward>)
tensor(0.0061, grad_fn=<MseLossBackward>)
tensor(0.0061, grad_fn=<MseLossBackward>)
tensor(0.0061, grad_fn=<MseLossBackward>)
tensor(0.0061, grad_fn=<MseLossBackward>)
tensor(0.0061, grad_fn=<MseLossBackward>)
tensor(0.0061, grad_fn=<MseLossBackward>)
tensor(0.0061, grad_fn=<MseLossBackward>)

validation loss
tensor(0.0061, grad_fn=<MseLossBackward>)
tensor(0.0061, grad_fn=<MseLossBackward>)
tensor(0.0061, grad_fn=<MseLossBackward>)
... (the same value repeated for all 50 iterations)

[Plot: train loss vs. validation loss]

The validation loss is a flat line. It is not what I want.

You should run validation after each training epoch to "validate" your model's capabilities. Also, in the validation phase the model parameters don't change, so it is expected that you get a constant loss if you run all the validation passes in a row after training has finished. Your training schedule should look like this:

training epoch 1

validation

training epoch 2

validation

...

Don't forget to use loss.item() instead of loss alone when calculating and averaging your losses, because loss is a tensor that carries a grad_fn, not a plain float value.
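For example, reusing the names from the question's code:

loss = criterion(model(inputs), outputs)
print(loss)                       # e.g. tensor(0.0061, grad_fn=<MseLossBackward>)
print(loss.item())                # e.g. 0.0061 -- a plain Python float
loss_values.append(loss.item())   # storing floats avoids keeping the autograd graph alive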

The code that you have written first trains the model for all the epochs, and only once the model is trained does it calculate the validation loss. Since the model is fixed by then, you see a flat line: the validation loss no longer changes. What you need to do is swap the order of your for loops: for each epoch, first train, then validate. Something like this:

for i in range(50):      #epoch=50
    for phase in ['train', 'val']:
        if phase == 'train':
            model.train()
            prediction = model(inputs)
            loss = criterion(prediction, outputs)
            print('train loss')
            print(loss.item())
            loss_values.append(loss.item())
            optimizer.zero_grad() #zero the parameter gradients
            epoch.append(i)
            loss.backward()       #compute gradients(dloss/dx)
            optimizer.step()      #updates the parameters
        elif phase == 'val':
            model.eval()
            with torch.no_grad():     #no gradients are needed for validation
                prediction_val = model(inputs_val)
                loss_val = criterion(prediction_val, outputs_val)
            print('validation loss')
            print(loss_val.item())
            val_values.append(loss_val.item())
