Why is my val_loss going down but my val_accuracy stagnates?
So I train my model with a dataset, and for each epoch I can see the loss and val_loss go down (it is important to note that val_loss only goes down to a certain point and then stagnates as well, with some minor ups and downs) and accuracy go up, but for some reason my val_accuracy stays at roughly 0.33.

I browsed this and it seems to be an overfitting problem, so I added Dropout layers and l2 regularization on some layers of the model, but it seems to have no effect. Therefore I would like to ask what you think I could improve in my model so that val_loss keeps going down and val_accuracy stops stagnating and keeps going up.

I've tried to use more images, but the problem seems to be the same. Not sure if my increase in images was enough, though.

Should I add Dropout layers inside the Conv2D blocks (see the sketch after the model code below)? Should I use less or more l2 regularization? Should I use even more images? Just some questions that might have something to do with my problem.
My model is below:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Conv2D, BatchNormalization, MaxPooling2D,
                                     Flatten, Dense, Dropout, Reshape)

model = Sequential()
model.add(Conv2D(16, kernel_size=(3, 3), input_shape=(580, 360, 1), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(Conv2D(16, kernel_size=(3, 3), activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.05)))
model.add(BatchNormalization())
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.05)))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.01)))
model.add(BatchNormalization())
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.01)))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(128, kernel_size=(3, 3), activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.02)))
model.add(BatchNormalization())
model.add(Conv2D(128, kernel_size=(3, 3), activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.02)))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.01)))
model.add(BatchNormalization())
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.05)))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten()) # Flattening the 2D arrays for fully connected layers
model.add(Dense(532, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(266, activation='softmax'))
model.add(Reshape((7, 38)))
print(model.summary())
# Pass the optimizer object to compile(); the string 'SGD' would ignore the
# custom learning rate and use the default instead.
optimizer = keras.optimizers.SGD(learning_rate=0.00001)
model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])
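To make the Dropout-in-Conv2D question concrete, this is the kind of change I have in mind for one of the conv blocks above (the 0.1 rate is just a placeholder, not a tested value):

from tensorflow.keras.layers import SpatialDropout2D

# Same block as above, with spatial dropout after the normalization;
# SpatialDropout2D drops whole feature maps rather than individual
# activations, which tends to suit convolutional features better.
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu'))
model.add(BatchNormalization())
model.add(SpatialDropout2D(0.1))  # placeholder rate
model.add(MaxPooling2D(pool_size=(2, 2)))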
Thanks in advance!
PS: Here is the graph of training:
PS2: Here is the end of training:
Epoch 40/40
209/209 [==============================] - 68s 327ms/step - loss: 0.7421 - accuracy: 0.9160 - val_loss: 3.8159 - val_accuracy: 0.3152
This seems to be a classic overfitting problem.
It would be nice to have a more detailed description of the problem: is it a classification task? Are your images grayscale? What is the purpose of this network?
With this information, I would say that any proper regularization of the network should help. Some things you could try:
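For example, data augmentation is a common extra regularizer for image models. A minimal sketch using Keras' ImageDataGenerator, where x_train, y_train, x_val, y_val are assumed placeholders for your arrays and all parameter values are illustrative starting points:

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# On-the-fly augmentation: each epoch sees slightly different variants
# of the training images, which acts as a regularizer.
datagen = ImageDataGenerator(
    rotation_range=10,        # illustrative values, tune for your data
    width_shift_range=0.1,
    height_shift_range=0.1,
    zoom_range=0.1,
)
model.fit(datagen.flow(x_train, y_train, batch_size=32),
          validation_data=(x_val, y_val),
          epochs=40)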
Hope this helps!
Just a hint:
You have a problem with your CNN architecture: the number of filters should get lower and lower at each convolution stage, but in your case it is growing: you have 16, 32, 64, 128. You should do it in the reverse manner: start from input_shape=(580, 360) and then go, let us say, to 256, 128, 64, 32 filters for the Conv2D layers, as in the sketch below.
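A minimal sketch of that ordering (only the convolutional stack; padding, regularizers, and the classification head are omitted, and the filter counts are just the ones suggested above):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D

# Filter counts decrease stage by stage: 256 -> 128 -> 64 -> 32.
model = Sequential()
model.add(Conv2D(256, kernel_size=(3, 3), activation='relu', input_shape=(580, 360, 1)))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(128, kernel_size=(3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))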