So I train my model with a dataset and for each epoch I can see the loss
and val_loss
go down (it is important to note that val_loss
will go down to a certain point, but then it stagnates as well, having some minor ups and downs) and accuracy
go up but for some reason my val_accuracy
stays at roughly 0.33. I browsed this and it seems to be a problem of overfitting so i added Dropout
layers and regularization by using l2
on some layers of the model but it seems to have no effect. Therefore I would like to ask you what do you think I could improve in my model in order to make the val_loss
keep going down and my val_accuracy
not stagnate and therefore keep going up.
I've tried to use more images but the problem seems to be the same.. Not sure if my increment of images was enough tho.
Should I add Dropout
layers in the Conv2D
layers? Should I use less or more l2
regularization? Should I use even more images? Just some questions that might have something to do with my problem.
My model is below:
model = Sequential()
model.add(Conv2D(16, kernel_size=(3, 3), input_shape=(580, 360, 1), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(Conv2D(16, kernel_size=(3, 3), activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.05)))
model.add(BatchNormalization())
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.05)))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.01)))
model.add(BatchNormalization())
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.01)))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(128, kernel_size=(3, 3), activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.02)))
model.add(BatchNormalization())
model.add(Conv2D(128, kernel_size=(3, 3), activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.02)))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.01)))
model.add(BatchNormalization())
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.05)))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten()) # Flattening the 2D arrays for fully connected layers
model.add(Dense(532, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(266, activation='softmax'))
model.add(Reshape((7, 38)))
print(model.summary())
optimizer = keras.optimizers.SGD(lr=0.00001)
model.compile(optimizer='SGD', loss='categorical_crossentropy', metrics=['accuracy'])
Thanks in advance!
PS: Here is the graph of training:
PS2: Here is the end of training:
209/209 [==============================] - 68s 327ms/step - loss: 0.7421 - accuracy: 0.9160 - val_loss: 3.8159 - val_accuracy: 0.3152
Epoch 40/40
This seems to be a classic overfitting problem.
It would be nice to have a more detailed introduction to the problem, like is it a classification task? Are your images grayscale? What is the purpose of this network?
With this information I would say that any proper regularization to the network should help. Some item you could try:
Hope this helps!
Just a hint:
You have a problem with your CNN architecture, the size must be lower and lower at each convolution, but in your case it is growing: you have 16, 32, 64, 64, 128. You should do that in the reverse manner. Start from input_shape=(580,360) and then you may go, let us say to shapes 256, 128, 64, 32 for Conv2D.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.