PyTorch DL model updates normally during training with a converging loss, but the OUTPUT values are the same for all data (regression)

Question

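I have only used neural networks as CNNs and RNNs; this is my first time using one for a regression task.

There are 30,000 samples. Each sample has 50 input features, and I have to predict 14 output features for each one.

So my goal is to make predictions for about 30,000 samples, and then my task is done: IN - 30,000 samples x 50 features -> OUT - 30,000 predictions x 14 features

These are my hyperparameters: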

    input_size = 50
    hidden_size = 40
    num_epochs = 7
    learning_rate = 1.00E-03
    output_size = 15
    batch_size = 30

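My code runs fine, and the loss converges over every iteration/epoch.

However, for some reason, I noticed that my output, which should be 30,000 predictions (rows) x 14 features (columns), is the same single 1 x 14 tensor repeated 30,000 times.

Like this: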

[[  1.3311,   1.0411,   0.9971,  13.6349,  31.4082,  16.5008,   3.2034,
         -26.2985, -26.3108, -22.4322,  24.3007, -26.2376, -26.2337, -26.2369],
        [  1.3311,   1.0411,   0.9971,  13.6349,  31.4082,  16.5008,   3.2034,
         -26.2985, -26.3108, -22.4322,  24.3007, -26.2376, -26.2337, -26.2369],
        [  1.3311,   1.0411,   0.9971,  13.6349,  31.4082,  16.5008,   3.2034,
         -26.2985, -26.3108, -22.4322,  24.3007, -26.2376, -26.2337, -26.2369]]

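Imagine this result, only with 3000 rows. (I don't even understand how I reached a converging loss.)

[Image: example of the same output values for all data]

I tried to track down where this problem starts, and it appears to happen during training as well.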


    import torch
    import torch.nn as nn

    # 5. Training loop
    n_total_steps = len(DS)  # number of training samples
    n_iterations = -(-n_total_steps // batch_size)  # ceiling division: batches per epoch
    training_loss = []

    loss_fn = nn.MSELoss()

    trainloader = torch.utils.data.DataLoader(
        DS, batch_size=batch_size, shuffle=True)
    testloader = torch.utils.data.DataLoader(
        TS, batch_size=batch_size)

    for epoch in range(num_epochs):
        print('\n')

        for i, (data, target) in enumerate(trainloader):
            data, target = data.to(device), target.to(device)
            outputs = model(data)
            loss = torch.sqrt(loss_fn(outputs, target))  # RMSE
            training_loss.append(loss.item())

            # 5.5 Backward pass
            opt.zero_grad()  # 5.6 Clear the stored gradients (or model.zero_grad())
            loss.backward()  # 5.7 Backprop
            opt.step()       # 5.8 Update params

            # 5.9 Print loss
            if (i + 1) % 100 == 0:
                print(f'Epoch {epoch+1}/{num_epochs}, Iteration {i+1}/{n_iterations}, Loss={loss.item():.4f} ')


    Epoch 1/7, Iteration 100/1321, Loss=1.5157 
    tensor([[  1.4186,   1.1157,   1.0471,  13.5818,  31.3844,  16.5334,   3.1015,
             -26.3141, -26.2974, -22.4117,  24.3678, -26.2477, -26.2577, -26.2387],
            [  1.4186,   1.1157,   1.0471,  13.5818,  31.3844,  16.5334,   3.1015,
             -26.3141, -26.2974, -22.4117,  24.3678, -26.2477, -26.2577, -26.2387],
            [  1.4186,   1.1157,   1.0471,  13.5818,  31.3844,  16.5334,   3.1015,
             -26.3141, -26.2974, -22.4117,  24.3678, -26.2477, -26.2577, -26.2387],
            [  1.4186,   1.1157,   1.0471,  13.5818,  31.3844,  16.5334,   3.1015,
             -26.3141, -26.2974, -22.4117,  24.3678, -26.2477, -26.2577, -26.2387],
            [  1.4186,   1.1157,   1.0471,  13.5818,  31.3844,  16.5334,   3.1015,
             -26.3141, -26.2974, -22.4117,  24.3678, -26.2477, -26.2577, -26.2387],
             ....
    Epoch 1/7, Iteration 300/1321, Loss=0.9697 
    tensor([[  1.3142,   1.0427,   0.9661,  13.6267,  31.2973,  16.5265,   3.1028,
             -26.2207, -26.2468, -22.3516,  24.4410, -26.1698, -26.1708, -26.1715],
            [  1.3142,   1.0427,   0.9661,  13.6267,  31.2973,  16.5265,   3.1028,
             -26.2207, -26.2468, -22.3516,  24.4410, -26.1698, -26.1708, -26.1715],
            [  1.3142,   1.0427,   0.9661,  13.6267,  31.2973,  16.5265,   3.1028,
             -26.2207, -26.2468, -22.3516,  24.4410, -26.1698, -26.1708, -26.1715],
            [  1.3142,   1.0427,   0.9661,  13.6267,  31.2973,  16.5265,   3.1028,
             -26.2207, -26.2468, -22.3516,  24.4410, -26.1698, -26.1708, -26.1715],
            [  1.3142,   1.0427,   0.9661,  13.6267,  31.2973,  16.5265,   3.1028,
             -26.2207, -26.2468, -22.3516,  24.4410, -26.1698, -26.1708, -26.1715],
            [  1.3142,   1.0427,   0.9661,  13.6267,  31.2973,  16.5265,   3.1028,
             -26.2207, -26.2468, -22.3516,  24.4410, -26.1698, -26.1708, -26.1715],

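To elaborate: my prediction model returns the same output values for each of the 3000 different input samples. It also updates them all together, rather than updating each row individually!

I don't understand how it reaches a low loss and converges.

NN code: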

    import pandas as pd
    import torch
    import torch.nn as nn
    import torch.optim as optim
    from torch.utils.data import Dataset


    class MyDataset(Dataset):

        def __init__(self, file_name):
            train_df = pd.read_csv(file_name)
            x = train_df.filter(regex='X')  # Input: X features
            y = train_df.filter(regex='Y')  # Output: Y features
            self.train_x = torch.tensor(x.values, dtype=torch.float32)
            self.train_y = torch.tensor(y.values, dtype=torch.float32)

        def __len__(self):
            return len(self.train_y)

        def __getitem__(self, idx):
            return self.train_x[idx], self.train_y[idx]


    class LGNN(nn.Module):
        def __init__(self, input_size, hidden_size, output_size):
            super().__init__()
            self.layer1 = nn.Linear(input_size, hidden_size)
            self.act = nn.Tanh()  # Tanh activation shared by all hidden layers
            self.layer2 = nn.Linear(hidden_size, hidden_size)
            self.layer3 = nn.Linear(hidden_size, hidden_size)
            self.layer4 = nn.Linear(hidden_size, output_size)

        def forward(self, x):
            out = self.layer1(x)
            out = self.act(out)
            out = self.layer2(out)
            out = self.act(out)
            out = self.layer3(out)
            out = self.act(out)
            out = self.layer4(out)  # no activation on the regression output
            return out


    # 4.1 Create NN model instance
    model = LGNN(input_size, hidden_size, output_size).to(device)  # .to(device) moves the model to the GPU
    model.apply(reset_weights)  # reset_weights is defined elsewhere (not shown)
    # 4.2 Loss and optimiser
    opt = optim.Adam(model.parameters(), lr=learning_rate)
    loss_fn = nn.MSELoss()

Also, the model has been validated with k-fold cross-validation. No overfitting or other problems were found :(

Answer 1

Why don't you try changing your learning rate? Take a look at this post: https://discuss.pytorch.org/t/why-am-i-getting-same-output-values-for-every-single-data-in-my-ann-model-for-multi-class-classification/57760/7
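As a rough sketch of what trying other learning rates could look like: the snippet below trains a network with the same layer layout as the LGNN in the question at a few learning rates and reports the final RMSE together with how much the outputs vary across rows. The data is synthetic, the helper names `make_model` and `train_with_lr` are made up for illustration, and 14 output columns are assumed from the question's description, so the numbers say nothing about the real dataset.

```python
import torch
import torch.nn as nn


def make_model(input_size=50, hidden_size=40, output_size=14):
    # Same layer layout as the LGNN in the question: three Tanh hidden layers.
    return nn.Sequential(
        nn.Linear(input_size, hidden_size), nn.Tanh(),
        nn.Linear(hidden_size, hidden_size), nn.Tanh(),
        nn.Linear(hidden_size, hidden_size), nn.Tanh(),
        nn.Linear(hidden_size, output_size),
    )


def train_with_lr(lr, steps=200, seed=0):
    """Train on synthetic data; return (final RMSE, mean per-column std of outputs)."""
    torch.manual_seed(seed)
    x = torch.randn(256, 50)              # synthetic inputs (assumption, not the real data)
    y = (x @ torch.randn(50, 14)) * 0.1   # synthetic targets that depend on the inputs
    model = make_model()
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(steps):
        opt.zero_grad()
        loss = torch.sqrt(loss_fn(model(x), y))  # RMSE, as in the question
        loss.backward()
        opt.step()
    with torch.no_grad():
        out = model(x)
        rmse = torch.sqrt(loss_fn(out, y)).item()
        spread = out.std(dim=0).mean().item()    # near 0 would mean identical rows
    return rmse, spread


results = {lr: train_with_lr(lr) for lr in (1e-4, 1e-3, 1e-2)}
for lr, (rmse, spread) in results.items():
    print(f"lr={lr:g}  RMSE={rmse:.4f}  output spread={spread:.4f}")
```

With the real data, `x` and `y` would come from the DataLoader instead. An output spread close to zero at every learning rate would suggest the problem lies elsewhere.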

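Otherwise, check whether the weights of your model's layers are actually being updated during training, and whether the input data is being fed into the model correctly.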
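A minimal sketch of those checks, under some assumptions: the helper names (`params_changed_after_step`, `row_spread`, `mean_baseline_rmse`) are hypothetical, and the batch and model here are random stand-ins rather than the real ones. The first helper compares every parameter before and after one optimizer step, the second measures how much a tensor varies across rows, and the third gives the RMSE of always predicting the per-column target mean. If the training loss sits near that last value, the network may only have learned the mean, which would explain a low loss with identical rows.

```python
import torch
import torch.nn as nn


def params_changed_after_step(model, opt, loss_fn, data, target):
    """Run one training step; return {param_name: True if the step changed it}."""
    before = {name: p.detach().clone() for name, p in model.named_parameters()}
    opt.zero_grad()
    loss = torch.sqrt(loss_fn(model(data), target))
    loss.backward()
    opt.step()
    return {name: not torch.equal(before[name], p.detach())
            for name, p in model.named_parameters()}


def row_spread(t):
    """Per-column standard deviation across rows; all zeros means identical rows."""
    return t.std(dim=0)


def mean_baseline_rmse(target):
    """RMSE obtained by always predicting the per-column mean of the targets."""
    return torch.sqrt(((target - target.mean(dim=0)) ** 2).mean())


torch.manual_seed(0)
data = torch.randn(30, 50)    # stand-in for one batch of inputs
target = torch.randn(30, 14)  # stand-in for the matching targets
model = nn.Sequential(nn.Linear(50, 40), nn.Tanh(), nn.Linear(40, 14))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

changed = params_changed_after_step(model, opt, loss_fn, data, target)
print("parameters updated:", changed)
print("input spread per column:", row_spread(data))
with torch.no_grad():
    print("output spread per column:", row_spread(model(data)))
print("constant-mean baseline RMSE:", mean_baseline_rmse(target).item())
```

On the real setup, `data` and `target` would be one batch from `trainloader`. An input spread of zero would point at the data loading, while a normal input spread with a zero output spread would point at the model or at the scale of the inputs.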
