
Why is my val_loss going down but my val_accuracy stagnates?

So I train my model on a dataset, and for each epoch I can see the loss and val_loss go down (val_loss goes down to a certain point, then stagnates as well, with some minor ups and downs) and the accuracy go up, but for some reason my val_accuracy stays at roughly 0.33. I looked this up and it seems to be an overfitting problem, so I added Dropout layers and L2 regularization on some layers of the model, but it seems to have no effect. So I would like to ask: what do you think I could improve in my model so that val_loss keeps going down and val_accuracy stops stagnating and keeps going up?

I've tried using more images, but the problem stays the same. I'm not sure whether my increase in images was enough, though.

Should I add Dropout layers between the Conv2D layers? Should I use less or more L2 regularization? Should I use even more images? These are just some questions that might have something to do with my problem.

My model is below:

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import Sequential
from tensorflow.keras.layers import (Conv2D, BatchNormalization, MaxPooling2D,
                                     Flatten, Dense, Dropout, Reshape)

model = Sequential()
model.add(Conv2D(16, kernel_size=(3, 3), input_shape=(580, 360, 1), padding='same', activation='relu'))
model.add(BatchNormalization())

model.add(Conv2D(16, kernel_size=(3, 3), activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.05)))
model.add(BatchNormalization())

model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.05)))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(64, kernel_size=(3, 3), activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.01)))
model.add(BatchNormalization())

model.add(Conv2D(64, kernel_size=(3, 3), activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.01)))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(128, kernel_size=(3, 3), activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.02)))
model.add(BatchNormalization())

model.add(Conv2D(128, kernel_size=(3, 3), activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.02)))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(64, kernel_size=(3, 3), activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.01)))
model.add(BatchNormalization())

model.add(Conv2D(64, kernel_size=(3, 3), activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.05)))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))


model.add(Flatten())  # Flattening the 2D arrays for fully connected layers

model.add(Dense(532, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(266, activation='softmax'))
model.add(Reshape((7, 38)))

print(model.summary())
optimizer = keras.optimizers.SGD(learning_rate=0.00001)  # `lr` is deprecated in favor of `learning_rate`
# Pass the optimizer object; the string 'SGD' creates a fresh optimizer with
# default settings and ignores the custom learning rate configured above.
model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])

Thanks in advance!

PS: Here is the graph of training:

[image: training and validation loss/accuracy curves]

PS2: Here is the end of training:

Epoch 40/40
209/209 [==============================] - 68s 327ms/step - loss: 0.7421 - accuracy: 0.9160 - val_loss: 3.8159 - val_accuracy: 0.3152

This seems to be a classic overfitting problem.

It would be nice to have a more detailed description of the problem: is it a classification task? Are your images grayscale? What is the purpose of this network?

With this information, I would say that any proper regularization of the network should help. Some things you could try:

  • For conv layers, I recommend using SpatialDropout2D layers.
  • Get more data (if possible)
  • Use data augmentation (if possible)
  • Increase the rate of the dropout layers
  • Try reducing the complexity of your model architecture (fewer layers, fewer filters, fewer neurons in the dense layers, etc.)
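A few of these suggestions might be sketched like this in Keras. This is an illustrative sketch only: the filter counts, dropout rates, and the assumed 3-class output are placeholders, not the asker's actual task.

```python
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import (Conv2D, SpatialDropout2D, BatchNormalization,
                                     MaxPooling2D, Flatten, Dense, Dropout)

# SpatialDropout2D drops whole feature maps rather than individual activations,
# which tends to regularize convolutional layers better than plain Dropout.
model = Sequential([
    Conv2D(16, (3, 3), padding='same', activation='relu', input_shape=(580, 360, 1)),
    BatchNormalization(),
    SpatialDropout2D(0.2),            # illustrative rate
    MaxPooling2D((2, 2)),
    Conv2D(32, (3, 3), padding='same', activation='relu'),
    SpatialDropout2D(0.2),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), padding='same', activation='relu'),
    SpatialDropout2D(0.2),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(64, activation='relu'),     # a smaller head than the original 532 units
    Dropout(0.5),                     # a higher rate than the original 0.2
    Dense(3, activation='softmax'),   # assuming 3 classes; adapt to your labels
])

# Simple on-the-fly data augmentation, applied to the inputs during training.
augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomFlip('horizontal'),
    tf.keras.layers.RandomRotation(0.05),
])
```

Preprocessing layers like RandomFlip and RandomRotation are only active in training mode, so validation data is left untouched.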

Hope this helps!

Just a hint:

There may be a problem with your CNN architecture: the size should get lower and lower at each convolution, but in your case it is growing; you have 16, 32, 64, 128 (then back to 64). You could try the reverse: start from input_shape=(580, 360) and then go, say, to 256, 128, 64, 32 for the Conv2D layers.
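The output-size arithmetic being discussed can be checked with a few lines of plain Python. Assuming stride-1 convolutions, a 'valid' 3x3 convolution shrinks each side by kernel_size - 1, and 2x2 pooling halves it, rounding down:

```python
def conv_out(size, kernel=3, padding='valid'):
    """Side length after a stride-1 convolution."""
    return size if padding == 'same' else size - (kernel - 1)

def pool_out(size, pool=2):
    """Side length after 2x2 max pooling (stride 2), rounded down."""
    return size // pool

# Trace the first block of the model above, starting from input_shape=(580, 360).
h, w = 580, 360
h, w = conv_out(h, padding='same'), conv_out(w, padding='same')  # 580 x 360
h, w = conv_out(h), conv_out(w)                                  # 578 x 358
h, w = pool_out(h), pool_out(w)                                  # 289 x 179
print(h, w)  # 289 179
```

Note that the numbers 16, 32, 64, 128 in the model are filter counts per layer, while the spatial size computed here is what shrinks as data flows through the network.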
