
Validation loss not decreasing

I am using Keras 2.0.7 with Python 3.5 and TensorFlow 1.3.0 on Windows 10.

I am testing the architecture from the paper "Intra Prediction Using Fully Connected Network for Video Coding".

I hope to use it for my own data.

I am using test data on which I expected the model to converge very quickly.

The learning rate I am using is 0.1, but my validation loss does not decrease.

Can someone take a look at or try this code? Am I making a wrong assumption, is there a coding mistake, or am I just being impatient? Thanks.

from keras.models import Model
from keras.layers import Input, Flatten, Concatenate, Dense, Dropout, Reshape
from keras.initializers import RandomNormal
from keras.optimizers import SGD

std = 0.01  # stddev for the weight initializer; the actual value is set elsewhere in my script


def multi_input_model():
    input1_ = Input(shape=(4, 20, 1), name='input1')
    input2_ = Input(shape=(16, 4, 1), name='input2')

    x1 = Flatten()(input1_)
    x2 = Flatten()(input2_)
    x = Concatenate()([x1, x2])

    # three fully connected hidden layers of 1024 ReLU units
    for i in range(3):
        # x = Dropout(0.3)(x)
        x = Dense(1024,
                  kernel_initializer=RandomNormal(stddev=std),
                  use_bias=True,
                  bias_initializer='zeros',
                  activation='relu'
                  )(x)

    # linear output layer (outside the loop, matching the summary below),
    # reshaped to the predicted 8x8 block
    # x = Dropout(0.3)(x)
    x = Dense(64,
              kernel_initializer=RandomNormal(stddev=std),
              use_bias=True,
              bias_initializer='zeros',
              )(x)

    output_ = Reshape((8, 8, 1), name='output')(x)

    model = Model(inputs=[input1_, input2_], outputs=[output_])
    model.summary()

    return model


sgd = SGD(lr=0.1, momentum=0.9)

model = multi_input_model()
model.compile(optimizer=sgd,
              loss='mean_squared_error',
              # `mse` is a custom metric defined elsewhere in my script; judging
              # by the log below it is computed on unscaled pixel values
              metrics=[mse]
              )
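
The fit call is not shown above; roughly (with placeholder array names standing in for my data, not verbatim code) it has the form below. The "Lr : 0.08" printed after each epoch in the log comes from a learning-rate callback that is also omitted here.

# Rough shape of the training call; x1_train/x2_train/y_train and the
# validation arrays are placeholders. epochs=100 matches the log below,
# the batch size is illustrative.
history = model.fit([x1_train, x2_train], y_train,
                    validation_data=([x1_val, x2_val], y_val),
                    epochs=100,
                    batch_size=1024)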

Model Summary:

Using TensorFlow backend.
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input1 (InputLayer)             (None, 4, 20, 1)     0                                            
__________________________________________________________________________________________________
input2 (InputLayer)             (None, 16, 4, 1)     0                                            
__________________________________________________________________________________________________
flatten_1 (Flatten)             (None, 80)           0           input1[0][0]                     
__________________________________________________________________________________________________
flatten_2 (Flatten)             (None, 64)           0           input2[0][0]                     
__________________________________________________________________________________________________
concatenate_1 (Concatenate)     (None, 144)          0           flatten_1[0][0]                  
                                                                 flatten_2[0][0]                  
__________________________________________________________________________________________________
dense_1 (Dense)                 (None, 1024)         148480      concatenate_1[0][0]              
__________________________________________________________________________________________________
dense_2 (Dense)                 (None, 1024)         1049600     dense_1[0][0]                    
__________________________________________________________________________________________________
dense_3 (Dense)                 (None, 1024)         1049600     dense_2[0][0]                    
__________________________________________________________________________________________________
dense_4 (Dense)                 (None, 64)           65600       dense_3[0][0]                    
__________________________________________________________________________________________________
output (Reshape)                (None, 8, 8, 1)      0           dense_4[0][0]                    
==================================================================================================
Total params: 2,313,280
Trainable params: 2,313,280
Non-trainable params: 0
__________________________________________________________________________________________________

Training log:

Train on 985373 samples, validate on 246344 samples
Epoch 1/100
985373/985373 [==============================] - 7s 7us/step - loss: 0.0054 - mse: 353.9386 - val_loss: 0.0087 - val_mse: 566.5364
Lr :  0.08
Epoch 2/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0044 - mse: 288.5897 - val_loss: 0.0082 - val_mse: 534.4153
Lr :  0.08
Epoch 3/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0042 - mse: 270.7345 - val_loss: 0.0080 - val_mse: 517.2601
Lr :  0.08
Epoch 4/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0040 - mse: 259.6213 - val_loss: 0.0078 - val_mse: 504.8340
Lr :  0.08
Epoch 5/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0039 - mse: 251.3669 - val_loss: 0.0076 - val_mse: 495.0704
Lr :  0.08
Epoch 6/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0038 - mse: 244.7449 - val_loss: 0.0075 - val_mse: 486.6413
Lr :  0.08
Epoch 7/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0037 - mse: 239.2320 - val_loss: 0.0074 - val_mse: 480.2631
Lr :  0.08
Epoch 8/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0036 - mse: 234.5702 - val_loss: 0.0073 - val_mse: 473.3974
Lr :  0.08
Epoch 9/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0035 - mse: 230.5504 - val_loss: 0.0072 - val_mse: 468.4981
Lr :  0.08
Epoch 10/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0035 - mse: 227.0740 - val_loss: 0.0071 - val_mse: 463.1125
Lr :  0.08
Epoch 11/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0034 - mse: 224.0050 - val_loss: 0.0071 - val_mse: 459.0103
Lr :  0.08
Epoch 12/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0034 - mse: 221.3146 - val_loss: 0.0070 - val_mse: 455.0988
Lr :  0.08
Epoch 13/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0034 - mse: 218.9012 - val_loss: 0.0069 - val_mse: 451.7273
Lr :  0.08
Epoch 14/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0033 - mse: 216.6984 - val_loss: 0.0069 - val_mse: 448.4461
Lr :  0.08
Epoch 15/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0033 - mse: 214.7047 - val_loss: 0.0069 - val_mse: 446.2414
Lr :  0.08
Epoch 16/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0033 - mse: 212.8751 - val_loss: 0.0068 - val_mse: 443.6305
Lr :  0.08
Epoch 17/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0032 - mse: 211.1775 - val_loss: 0.0068 - val_mse: 441.6251
Lr :  0.08
Epoch 18/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0032 - mse: 209.5941 - val_loss: 0.0068 - val_mse: 439.3194
Lr :  0.08
Epoch 19/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0032 - mse: 208.1229 - val_loss: 0.0067 - val_mse: 436.9524
Lr :  0.08
Epoch 20/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0032 - mse: 206.7133 - val_loss: 0.0067 - val_mse: 435.1670
Lr :  0.08
Epoch 21/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0032 - mse: 205.4114 - val_loss: 0.0067 - val_mse: 432.8971
Lr :  0.08
Epoch 22/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0031 - mse: 204.1449 - val_loss: 0.0066 - val_mse: 431.2357
Lr :  0.08
Epoch 23/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0031 - mse: 202.9737 - val_loss: 0.0066 - val_mse: 430.8359
Lr :  0.08
Epoch 24/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0031 - mse: 201.8537 - val_loss: 0.0066 - val_mse: 429.2760
Lr :  0.08
Epoch 25/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0031 - mse: 200.7699 - val_loss: 0.0066 - val_mse: 428.2598
Lr :  0.08
Epoch 26/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0031 - mse: 199.7556 - val_loss: 0.0066 - val_mse: 426.2759
Lr :  0.08
Epoch 27/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0031 - mse: 198.7569 - val_loss: 0.0065 - val_mse: 425.0138
Lr :  0.08
Epoch 28/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0030 - mse: 197.8049 - val_loss: 0.0065 - val_mse: 423.5451
Lr :  0.08
Epoch 29/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0030 - mse: 196.8937 - val_loss: 0.0065 - val_mse: 422.2164
Lr :  0.08
Epoch 30/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0030 - mse: 196.0350 - val_loss: 0.0065 - val_mse: 421.1507
Lr :  0.08
Epoch 31/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0030 - mse: 195.1807 - val_loss: 0.0065 - val_mse: 420.2791
Lr :  0.08
Epoch 32/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0030 - mse: 194.3681 - val_loss: 0.0065 - val_mse: 420.0271
Lr :  0.08
Epoch 33/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0030 - mse: 193.5861 - val_loss: 0.0064 - val_mse: 418.1573
Lr :  0.08
Epoch 34/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0030 - mse: 192.8048 - val_loss: 0.0064 - val_mse: 417.7967
Lr :  0.08
Epoch 35/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0030 - mse: 192.0708 - val_loss: 0.0064 - val_mse: 416.3381
Lr :  0.08
Epoch 36/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0029 - mse: 191.3404 - val_loss: 0.0064 - val_mse: 416.3695
Lr :  0.08
Epoch 37/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0029 - mse: 190.6411 - val_loss: 0.0064 - val_mse: 415.9791
Lr :  0.08
Epoch 38/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0029 - mse: 189.9464 - val_loss: 0.0064 - val_mse: 414.0931
Lr :  0.08
Epoch 39/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0029 - mse: 189.2725 - val_loss: 0.0064 - val_mse: 413.8717
Lr :  0.08
Epoch 40/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0029 - mse: 188.6250 - val_loss: 0.0064 - val_mse: 413.0042
Lr :  0.08
Epoch 41/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0029 - mse: 187.9844 - val_loss: 0.0064 - val_mse: 413.0950
Lr :  0.08
Epoch 42/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0029 - mse: 187.3532 - val_loss: 0.0063 - val_mse: 412.4408
Lr :  0.08
Epoch 43/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0029 - mse: 186.7512 - val_loss: 0.0063 - val_mse: 411.1885
Lr :  0.08
Epoch 44/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0029 - mse: 186.1417 - val_loss: 0.0063 - val_mse: 410.7527
Lr :  0.08
Epoch 45/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0029 - mse: 185.5806 - val_loss: 0.0063 - val_mse: 409.3184
Lr :  0.08
Epoch 46/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0028 - mse: 185.0026 - val_loss: 0.0063 - val_mse: 410.3592
Lr :  0.08
Epoch 47/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0028 - mse: 184.4485 - val_loss: 0.0063 - val_mse: 409.0613
Lr :  0.08
Epoch 48/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0028 - mse: 183.8902 - val_loss: 0.0063 - val_mse: 409.2569
Lr :  0.08
Epoch 49/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0028 - mse: 183.3491 - val_loss: 0.0063 - val_mse: 408.1287
Lr :  0.08
Epoch 50/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0028 - mse: 182.8373 - val_loss: 0.0063 - val_mse: 407.0794
Lr :  0.08
Epoch 51/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0028 - mse: 182.3068 - val_loss: 0.0063 - val_mse: 407.4011
Lr :  0.08
Epoch 52/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0028 - mse: 181.7958 - val_loss: 0.0062 - val_mse: 405.9963
Lr :  0.08
Epoch 53/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0028 - mse: 181.2954 - val_loss: 0.0062 - val_mse: 406.3628
Lr :  0.08
Epoch 54/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0028 - mse: 180.8150 - val_loss: 0.0062 - val_mse: 405.7962
Lr :  0.08
Epoch 55/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0028 - mse: 180.3107 - val_loss: 0.0062 - val_mse: 405.0873
Lr :  0.08
Epoch 56/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0028 - mse: 179.8412 - val_loss: 0.0062 - val_mse: 405.1387
Lr :  0.08
Epoch 57/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0028 - mse: 179.3717 - val_loss: 0.0062 - val_mse: 404.4157
Lr :  0.08
Epoch 58/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0028 - mse: 178.9016 - val_loss: 0.0062 - val_mse: 403.5622
Lr :  0.08
Epoch 59/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0027 - mse: 178.4647 - val_loss: 0.0062 - val_mse: 403.3207
Lr :  0.08
Epoch 60/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0027 - mse: 177.9997 - val_loss: 0.0062 - val_mse: 403.5156
Lr :  0.08
Epoch 61/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0027 - mse: 177.5587 - val_loss: 0.0062 - val_mse: 403.2921
Lr :  0.08
Epoch 62/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0027 - mse: 177.1332 - val_loss: 0.0062 - val_mse: 402.8525
Lr :  0.08
Epoch 63/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0027 - mse: 176.6831 - val_loss: 0.0062 - val_mse: 402.3887
Lr :  0.08
Epoch 64/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0027 - mse: 176.2728 - val_loss: 0.0062 - val_mse: 401.6309
Lr :  0.08
Epoch 65/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0027 - mse: 175.8403 - val_loss: 0.0062 - val_mse: 401.4650
Lr :  0.08
Epoch 66/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0027 - mse: 175.4337 - val_loss: 0.0062 - val_mse: 401.6886
Lr :  0.08
Epoch 67/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0027 - mse: 175.0153 - val_loss: 0.0062 - val_mse: 401.1379
Lr :  0.08
Epoch 68/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0027 - mse: 174.6212 - val_loss: 0.0062 - val_mse: 401.4452
Lr :  0.08
Epoch 69/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0027 - mse: 174.2141 - val_loss: 0.0062 - val_mse: 401.5032
Lr :  0.08
Epoch 70/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0027 - mse: 173.8265 - val_loss: 0.0062 - val_mse: 400.2583
Lr :  0.08
Epoch 71/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0027 - mse: 173.4480 - val_loss: 0.0062 - val_mse: 399.9907
Lr :  0.08
Epoch 72/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0027 - mse: 173.0551 - val_loss: 0.0062 - val_mse: 400.5084
Lr :  0.08
Epoch 73/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0027 - mse: 172.6990 - val_loss: 0.0062 - val_mse: 400.7729
Lr :  0.08
Epoch 74/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0026 - mse: 172.3093 - val_loss: 0.0061 - val_mse: 398.8733
Lr :  0.08
Epoch 75/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0026 - mse: 171.9405 - val_loss: 0.0061 - val_mse: 399.6324
Lr :  0.08
Epoch 76/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0026 - mse: 171.5539 - val_loss: 0.0062 - val_mse: 400.0404
Lr :  0.08
Epoch 77/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0026 - mse: 171.2163 - val_loss: 0.0061 - val_mse: 399.0176
Lr :  0.08
Epoch 78/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0026 - mse: 170.8428 - val_loss: 0.0061 - val_mse: 398.2502
Lr :  0.08
Epoch 79/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0026 - mse: 170.5107 - val_loss: 0.0061 - val_mse: 399.4756
Lr :  0.08
Epoch 80/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0026 - mse: 170.1418 - val_loss: 0.0061 - val_mse: 398.6772
Lr :  0.08
Epoch 81/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0026 - mse: 169.7998 - val_loss: 0.0061 - val_mse: 398.6140
Lr :  0.08
Epoch 82/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0026 - mse: 169.4597 - val_loss: 0.0061 - val_mse: 398.5729
Lr :  0.08
Epoch 83/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0026 - mse: 169.1229 - val_loss: 0.0061 - val_mse: 397.8155
Lr :  0.08
Epoch 84/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0026 - mse: 168.7856 - val_loss: 0.0061 - val_mse: 397.5096
Lr :  0.08
Epoch 85/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0026 - mse: 168.4588 - val_loss: 0.0061 - val_mse: 397.9902
Lr :  0.08
Epoch 86/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0026 - mse: 168.1268 - val_loss: 0.0061 - val_mse: 397.3359
Lr :  0.08
Epoch 87/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0026 - mse: 167.8185 - val_loss: 0.0061 - val_mse: 397.9455
Lr :  0.08
Epoch 88/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0026 - mse: 167.4879 - val_loss: 0.0061 - val_mse: 397.7877
Lr :  0.08
Epoch 89/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0026 - mse: 167.1769 - val_loss: 0.0061 - val_mse: 397.3274
Lr :  0.08
Epoch 90/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0026 - mse: 166.8585 - val_loss: 0.0061 - val_mse: 397.2146
Lr :  0.08
Epoch 91/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0026 - mse: 166.5457 - val_loss: 0.0061 - val_mse: 397.8896
Lr :  0.08
Epoch 92/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0026 - mse: 166.2354 - val_loss: 0.0061 - val_mse: 396.7620
Lr :  0.08
Epoch 93/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0026 - mse: 165.9282 - val_loss: 0.0061 - val_mse: 397.0715
Lr :  0.08
Epoch 94/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0025 - mse: 165.6314 - val_loss: 0.0061 - val_mse: 397.2557
Lr :  0.08
Epoch 95/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0025 - mse: 165.3255 - val_loss: 0.0061 - val_mse: 396.8092
Lr :  0.08
Epoch 96/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0025 - mse: 165.0327 - val_loss: 0.0061 - val_mse: 396.3043
Lr :  0.08
Epoch 97/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0025 - mse: 164.7327 - val_loss: 0.0061 - val_mse: 397.1202
Lr :  0.08
Epoch 98/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0025 - mse: 164.4545 - val_loss: 0.0061 - val_mse: 397.8727
Lr :  0.08
Epoch 99/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0025 - mse: 164.1428 - val_loss: 0.0061 - val_mse: 397.3806
Lr :  0.08
Epoch 100/100
985373/985373 [==============================] - 6s 6us/step - loss: 0.0025 - mse: 163.8922 - val_loss: 0.0061 - val_mse: 396.7912
Lr :  0.08

Your code is working correctly. From the training log you can see that the model is converging nicely. Your problem is that there is a large gap between your training loss and your validation loss, i.e. the model is overfitting. This is sometimes called "low bias, high variance".

Solutions include (a short Keras sketch applying the first two follows the list):

  • Regularisation
  • Early stopping
  • Getting more training data
  • Ensuring that your validation data and training data are drawn from the same distribution
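
As a sketch (not code from your question, and the specific values are assumptions to adapt to your setup), the first two suggestions could be applied to your model like this: an L2 weight penalty on each Dense layer, plus an EarlyStopping callback that halts training once val_loss stops improving.

from keras.regularizers import l2
from keras.callbacks import EarlyStopping

# L2-regularised hidden layer: apply the same kernel_regularizer to each Dense
# in multi_input_model(); 1e-4 is a starting guess, tune it on validation loss.
x = Dense(1024,
          kernel_initializer=RandomNormal(stddev=std),
          kernel_regularizer=l2(1e-4),
          activation='relu'
          )(x)

# Stop training once val_loss has not improved for 5 consecutive epochs.
early_stop = EarlyStopping(monitor='val_loss', patience=5)

model.fit([x1_train, x2_train], y_train,            # placeholder data names
          validation_data=([x1_val, x2_val], y_val),
          epochs=100,
          callbacks=[early_stop])

The Dropout layers already commented out in your model are another regulariser worth re-enabling for the same purpose.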

There are lots of resources that will give you further ideas on these topics.
