为什么我的 val_loss 下降但我的 val_accuracy 停滞不前

Question

So I train my model with a dataset and for each epoch I can see the loss and val_loss go down (it is important to note that val_loss will go down to a certain point, but then it stagnates as well, having some minor ups and downs) and accuracy go up but for some reason my val_accuracy stays at roughly 0.33. So I train my model with a dataset and for each epoch I can see the loss and val_loss go down (it is important to note that val_loss will go down to a certain point, but then it stagnates as well, having some minor ups and downs ) 和accuracy go 上升，但由于某种原因，我的val_accuracy保持在大约 0.33。 I browsed this and it seems to be a problem of overfitting so i added Dropout layers and regularization by using l2 on some layers of the model but it seems to have no effect.我浏览了这个，这似乎是一个过度拟合的问题，所以我通过在 model 的某些层上使用l2添加了Dropout层和正则化，但它似乎没有效果。 Therefore I would like to ask you what do you think I could improve in my model in order to make the val_loss keep going down and my val_accuracy not stagnate and therefore keep going up.因此，我想问您，您认为我可以在 model 中进行哪些改进，以使val_loss继续下降，而我的val_accuracy不会停滞并因此继续上升。

I've tried to use more images but the problem seems to be the same.. Not sure if my increment of images was enough tho.我尝试使用更多图像，但问题似乎是相同的.. 不确定我的图像增量是否足够。

Should I add Dropout layers in the Conv2D layers?我应该在Conv2D层中添加Dropout层吗？ Should I use less or more l2 regularization?我应该使用更少还是更多的l2正则化？ Should I use even more images?我应该使用更多图像吗？ Just some questions that might have something to do with my problem.只是一些可能与我的问题有关的问题。

My model is below:我的 model 如下：

model = Sequential()
model.add(Conv2D(16, kernel_size=(3, 3), input_shape=(580, 360, 1), padding='same', activation='relu'))
model.add(BatchNormalization())

model.add(Conv2D(16, kernel_size=(3, 3), activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.05)))
model.add(BatchNormalization())

model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.05)))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(64, kernel_size=(3, 3), activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.01)))
model.add(BatchNormalization())

model.add(Conv2D(64, kernel_size=(3, 3), activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.01)))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(128, kernel_size=(3, 3), activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.02)))
model.add(BatchNormalization())

model.add(Conv2D(128, kernel_size=(3, 3), activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.02)))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(64, kernel_size=(3, 3), activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.01)))
model.add(BatchNormalization())

model.add(Conv2D(64, kernel_size=(3, 3), activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.05)))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))


model.add(Flatten())  # Flattening the 2D arrays for fully connected layers

model.add(Dense(532, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(266, activation='softmax'))
model.add(Reshape((7, 38)))

print(model.summary())
optimizer = keras.optimizers.SGD(lr=0.00001)
model.compile(optimizer='SGD', loss='categorical_crossentropy', metrics=['accuracy'])

Thanks in advance!提前致谢！

PS: Here is the graph of training: PS：训练图如下：

PS2: Here is the end of training: PS2：训练到此结束：

209/209 [==============================] - 68s 327ms/step - loss: 0.7421 - accuracy: 0.9160 - val_loss: 3.8159 - val_accuracy: 0.3152
Epoch 40/40

Answer 1

This seems to be a classic overfitting problem.这似乎是一个经典的过拟合问题。

It would be nice to have a more detailed introduction to the problem, like is it a classification task?对这个问题有更详细的介绍会很好，比如它是一个分类任务吗？ Are your images grayscale?你的图像是灰度的吗？ What is the purpose of this network?这个网络的目的是什么？

With this information I would say that any proper regularization to the network should help.有了这些信息，我想说对网络进行任何适当的正则化都会有所帮助。 Some item you could try:您可以尝试的一些项目：

For conv layers I recommend using SpatialDropout layers .对于 conv 层，我建议使用SpatialDropout 层。
Get more data (if possible)获取更多数据（如果可能）
Use data augmentation (if possible)使用数据增强（如果可能）
Increase the rate of the dropout layers增加 dropout 层的速率
Try reducing the complexity of your model architecture (maybe fewer layers, fewer number of filters in general, fewer number of neurons in dense layers, etc.)尝试降低 model 架构的复杂性（可能层数更少，一般滤波器数量更少，密集层中的神经元数量更少等）

Hope this helps!希望这可以帮助！

Answer 2

Just a hint:只是一个提示：

You have a problem with your CNN architecture, the size must be lower and lower at each convolution, but in your case it is growing: you have 16, 32, 64, 64, 128. You should do that in the reverse manner.您的 CNN 架构有问题，每次卷积的大小必须越来越小，但在您的情况下，它正在增长：您有 16、32、64、64、128。您应该以相反的方式进行操作。 Start from input_shape=(580,360) and then you may go, let us say to shapes 256, 128, 64, 32 for Conv2D.从 input_shape=(580,360) 开始，然后你可以 go，让我们说 Conv2D 的形状为 256、128、64、32。

为什么我的 val_loss 下降但我的 val_accuracy 停滞不前

问题描述

2 个解决方案

解决方案1
2 已采纳 2020-05-10 18:40:13

解决方案2
1 2020-05-10 18:02:01

为什么我的 val_loss 下降但我的 val_accuracy 停滞不前

问题描述

2 个解决方案

解决方案1 2 已采纳 2020-05-10 18:40:13

解决方案2 1 2020-05-10 18:02:01

解决方案1
2 已采纳 2020-05-10 18:40:13

解决方案2
1 2020-05-10 18:02:01