
Pytorch DL model, updates normally with converging losses during training, but SAME OUTPUT values for all data (regression)

I have only used neural networks for CNN and RNN tasks before, so this is my first time using one for a regression task.

There are 30,000 samples. Each sample has 50 input features, and I have to predict 14 output features for each one.

So my goal is to run predictions over roughly 30,000 samples, at which point the task is done: IN - 30,000 samples x 50 features -> OUT - 30,000 predictions x 14 features

These are my hyperparameters:

input_size = 50
hidden_size = 40
num_epochs = 7
learning_rate = 1.00E-03
output_size = 15
batch_size = 30

My code runs fine, and the loss converges with every iteration/epoch.

However, for some reason I've noticed that my output, which should be 30,000 predictions (rows) x 14 features (columns), returns the same 1 x 14 tensor repeated 30,000 times.

Like this:

[[  1.3311,   1.0411,   0.9971,  13.6349,  31.4082,  16.5008,   3.2034,
         -26.2985, -26.3108, -22.4322,  24.3007, -26.2376, -26.2337, -26.2369],
        [  1.3311,   1.0411,   0.9971,  13.6349,  31.4082,  16.5008,   3.2034,
         -26.2985, -26.3108, -22.4322,  24.3007, -26.2376, -26.2337, -26.2369],
        [  1.3311,   1.0411,   0.9971,  13.6349,  31.4082,  16.5008,   3.2034,
         -26.2985, -26.3108, -22.4322,  24.3007, -26.2376, -26.2337, -26.2369]]

Imagine this result, just with 3,000 rows. (I don't even understand how I got a converging loss.) An example of the same output values for all data.
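(Not from the original post, just a quick way to confirm the symptom: the per-feature standard deviation across rows of the prediction tensor should be roughly zero if every sample collapses to the same output. Here `preds` is a hypothetical name for the (N, 14) prediction tensor.)

    # Hypothetical sanity check on the (N, 14) prediction tensor `preds`:
    print(preds.std(dim=0))                 # ~0 everywhere in this failure mode
    print(bool((preds == preds[0]).all()))  # True if every row is identical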

I tried to trace where this problem starts, and it seems to happen during training as well.


    # 5. Training loop
    n_total_steps = len(DS)
    n_iterations = -(-n_total_steps // batch_size)  # ceiling division
    training_loss = []

    loss_fn = nn.MSELoss()

    trainloader = torch.utils.data.DataLoader(DS, batch_size=batch_size, shuffle=True)
    testloader = torch.utils.data.DataLoader(TS, batch_size=batch_size)

    for epoch in range(num_epochs):
        print('\n')

        for i, (data, target) in enumerate(trainloader):
            data, target = data.to(device), target.to(device)
            outputs = model(data)
            loss = torch.sqrt(loss_fn(outputs, target))
            training_loss.append(loss.item())

            # 5.5 Backward pass
            opt.zero_grad()  # 5.6 Empty the values in the gradient attribute, or model.zero_grad()
            loss.backward()  # 5.7 Backprop
            opt.step()       # 5.8 Update params

            # 5.9 Print loss
            if (i + 1) % 100 == 0:
                print(f'Epoch {epoch+1}/{num_epochs}, Iteration {i+1}/{n_iterations}, Loss={loss.item():.4f}')

    Epoch 1/7, Iteration 100/1321, Loss=1.5157 
    tensor([[  1.4186,   1.1157,   1.0471,  13.5818,  31.3844,  16.5334,   3.1015,
             -26.3141, -26.2974, -22.4117,  24.3678, -26.2477, -26.2577, -26.2387],
            [  1.4186,   1.1157,   1.0471,  13.5818,  31.3844,  16.5334,   3.1015,
             -26.3141, -26.2974, -22.4117,  24.3678, -26.2477, -26.2577, -26.2387],
            [  1.4186,   1.1157,   1.0471,  13.5818,  31.3844,  16.5334,   3.1015,
             -26.3141, -26.2974, -22.4117,  24.3678, -26.2477, -26.2577, -26.2387],
            [  1.4186,   1.1157,   1.0471,  13.5818,  31.3844,  16.5334,   3.1015,
             -26.3141, -26.2974, -22.4117,  24.3678, -26.2477, -26.2577, -26.2387],
            [  1.4186,   1.1157,   1.0471,  13.5818,  31.3844,  16.5334,   3.1015,
             -26.3141, -26.2974, -22.4117,  24.3678, -26.2477, -26.2577, -26.2387],
             ....
    Epoch 1/7, Iteration 300/1321, Loss=0.9697
    tensor([[  1.3142,   1.0427,   0.9661,  13.6267,  31.2973,  16.5265,   3.1028,
             -26.2207, -26.2468, -22.3516,  24.4410, -26.1698, -26.1708, -26.1715],
            [  1.3142,   1.0427,   0.9661,  13.6267,  31.2973,  16.5265,   3.1028,
             -26.2207, -26.2468, -22.3516,  24.4410, -26.1698, -26.1708, -26.1715],
            [  1.3142,   1.0427,   0.9661,  13.6267,  31.2973,  16.5265,   3.1028,
             -26.2207, -26.2468, -22.3516,  24.4410, -26.1698, -26.1708, -26.1715],
            [  1.3142,   1.0427,   0.9661,  13.6267,  31.2973,  16.5265,   3.1028,
             -26.2207, -26.2468, -22.3516,  24.4410, -26.1698, -26.1708, -26.1715],
            [  1.3142,   1.0427,   0.9661,  13.6267,  31.2973,  16.5265,   3.1028,
             -26.2207, -26.2468, -22.3516,  24.4410, -26.1698, -26.1708, -26.1715],
            [  1.3142,   1.0427,   0.9661,  13.6267,  31.2973,  16.5265,   3.1028,
             -26.2207, -26.2468, -22.3516,  24.4410, -26.1698, -26.1708, -26.1715],

To elaborate: my model's predictions return the same values for every one of the 3,000 different input samples. And between iterations the whole block of values shifts together, rather than each row being updated individually!

I don't understand how it reaches a low loss and converges at all.
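(One plausible explanation, not from the original post: among all constant predictors, MSE is minimised by predicting the per-feature mean of the targets, so the loss can genuinely decrease while every row stays identical; the network may simply be drifting its constant output towards the target means. A minimal sketch of that baseline, where `train_y` stands in for the real (N, 14) target tensor:)

    import torch

    # Hypothetical illustration: RMSE of the best constant predictor.
    train_y = torch.randn(3000, 14)                  # stand-in for the real targets
    mean_pred = train_y.mean(dim=0)                  # per-feature target mean
    constant_outputs = mean_pred.expand_as(train_y)  # same row repeated N times
    rmse = torch.sqrt(torch.mean((constant_outputs - train_y) ** 2))
    print(rmse)  # a collapsed model can approach this loss without using the inputs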

NN code:

    import pandas as pd
    import torch
    import torch.nn as nn
    import torch.optim as optim
    from torch.utils.data import Dataset

    class MyDataset(Dataset):

        def __init__(self, file_name):
            train_df = pd.read_csv(file_name)
            x = train_df.filter(regex='X')  # Input : X features
            y = train_df.filter(regex='Y')  # Output : Y features
            self.train_x = torch.tensor(x.values, dtype=torch.float32)
            self.train_y = torch.tensor(y.values, dtype=torch.float32)

        def __len__(self):
            return len(self.train_y)

        def __getitem__(self, idx):
            return self.train_x[idx], self.train_y[idx]

    class LGNN(nn.Module):
        def __init__(self, input_size, hidden_size, output_size):
            super().__init__()
            self.layer1 = nn.Linear(input_size, hidden_size)
            self.act = nn.Tanh()  # activation (named `relu` in the original, but actually Tanh)
            self.layer2 = nn.Linear(hidden_size, hidden_size)
            self.layer3 = nn.Linear(hidden_size, hidden_size)
            self.layer4 = nn.Linear(hidden_size, output_size)

        def forward(self, x):
            out = self.layer1(x)
            out = self.act(out)
            out = self.layer2(out)
            out = self.act(out)
            out = self.layer3(out)
            out = self.act(out)
            out = self.layer4(out)
            return out

    # 4.1 Create NN model instance
    model = LGNN(input_size, hidden_size, output_size).to(device)  # .to(device) moves the model to the GPU
    model.apply(reset_weights)  # reset_weights: weight re-initialisation helper defined elsewhere
    # 4.2 Loss and Optimiser
    opt = optim.Adam(model.parameters(), lr=learning_rate)
    loss_fn = nn.MSELoss()
• Also, the model has already been through k-fold cross-validation. No overfitting or other problems were found :(

Why don't you try changing your learning rate? Take a look at this post: https://discuss.pytorch.org/t/why-am-i-getting-same-output-values-for-every-single-data-in-my-ann-model-for-multi-class-classification/57760/7
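(For illustration only, not part of the original answer: lowering the learning rate is a one-line change, and a scheduler can shrink it further when the loss plateaus. `ReduceLROnPlateau` is a standard PyTorch scheduler; the value 1e-4 below is an assumption, not a recommendation from the answer.)

    import torch.optim as optim

    # Hypothetical example: start from a smaller learning rate than 1e-3 ...
    opt = optim.Adam(model.parameters(), lr=1e-4)
    # ... and reduce it further whenever the monitored loss stops improving.
    scheduler = optim.lr_scheduler.ReduceLROnPlateau(opt, factor=0.1, patience=2)

    # inside the epoch loop, after computing the epoch's loss:
    # scheduler.step(epoch_loss)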

Otherwise, check whether your model's layer weights are actually updating during training, or whether the input data is being fed into the model correctly.
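(A minimal sketch of such a check, assuming the `model`, `opt`, `loss_fn`, `data`, and `target` names from the question: snapshot the parameters before an optimiser step and compare afterwards.)

    import torch

    # Hypothetical diagnostic: verify that opt.step() actually changes the weights.
    before = {name: p.detach().clone() for name, p in model.named_parameters()}
    outputs = model(data)
    loss = torch.sqrt(loss_fn(outputs, target))
    opt.zero_grad()
    loss.backward()
    opt.step()
    for name, p in model.named_parameters():
        print(name, 'updated:', not torch.equal(before[name], p.detach()))

    # Also confirm that different samples really reach the model:
    print(data.std(dim=0))  # ~0 here would mean the inputs themselves are constant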
