How can I prevent overfitting in this model?

Question

I built this model for an image classification problem. The validation accuracy is consistently 5-8% lower than the training accuracy, and the validation loss is much higher than the training loss. Here is an example from one of my epochs: loss: 0.2232 - acc: 0.9245 - val_loss: 0.4131 - val_acc: 0.8700

# Imports assume tf.keras (TensorFlow 2.x); the original post did not show them.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Activation, Conv2D, Dense, Dropout,
                                     Flatten, MaxPooling2D)
from tensorflow.keras.optimizers import RMSprop
from tensorflow.keras.preprocessing.image import ImageDataGenerator

model = Sequential()

# Four convolutional blocks (32, 64, 128, 256 filters), each with two
# 3x3 convolutions followed by 2x2 max pooling.
model.add(Conv2D(32, (3, 3), padding='same', activation='relu',
                 input_shape=(150, 150, 3)))
model.add(Conv2D(32, (3, 3), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(64, (3, 3), padding='same', activation='relu'))
model.add(Conv2D(64, (3, 3), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(128, (3, 3), padding='same', activation='relu'))
model.add(Conv2D(128, (3, 3), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(256, (3, 3), padding='same', activation='relu'))
model.add(Conv2D(256, (3, 3), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

# Classifier head: two dense layers with dropout, then a sigmoid output.
model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))

model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))

model.add(Dense(1))
model.add(Activation('sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer=RMSprop(learning_rate=0.0001),
              metrics=['accuracy'])

# this is the augmentation configuration we will use for training
train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.1,
    zoom_range=0.1,
    horizontal_flip=True)

# this is the augmentation configuration we will use for testing:
# only rescaling
test_datagen = ImageDataGenerator(rescale=1. / 255)

I've tried Bayesian hyperparameter optimization with Hyperas, but the hyperparameters it recommends aren't working well for me. What should I change in my model to prevent it from overfitting? I'm training and validating on a small dataset because I won't have much data in the real-world setting where the model will be used. Any recommendations would be greatly appreciated.

Answer 1

Overfitting is one thing; a gap between training and validation error is another.

The fact that your training scores are better than your validation scores doesn't mean that you are overfitting. You are overfitting when your validation score reaches its best value and then starts getting worse as training continues.
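
One way to check this on a training run is to look for the epoch where the validation loss bottoms out. The sketch below is only illustrative: `best_epoch` and `is_overfitting` are hypothetical helper names, the `patience` threshold is an assumption, and the loss values are invented for the example rather than taken from the question.

```python
def best_epoch(val_losses):
    """Return the 0-based index of the epoch with the lowest validation loss."""
    return min(range(len(val_losses)), key=lambda i: val_losses[i])


def is_overfitting(val_losses, patience=3):
    """Return True if validation loss has not improved for `patience` epochs.

    Same idea as early stopping: once the validation loss has passed its
    minimum and keeps failing to improve on it, further training mostly
    fits the training set.
    """
    if not val_losses:
        return False
    epochs_since_best = len(val_losses) - 1 - best_epoch(val_losses)
    return epochs_since_best >= patience


# Made-up validation losses: improving at first, then degrading.
val_losses = [0.69, 0.55, 0.47, 0.41, 0.43, 0.46, 0.52]
print(best_epoch(val_losses))      # index of the lowest validation loss
print(is_overfitting(val_losses))  # whether the minimum is `patience` epochs behind
```

In Keras, the `EarlyStopping` callback (for example with `monitor='val_loss'` and `restore_best_weights=True`) applies the same idea during training, so you would normally use that rather than a hand-written check.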

If what you are looking for is a better validation score, that is, better model generalization, here is what you can do:

- increase dropout (your dropout looks good enough, but try increasing it and see what happens)
- use more data for training (not possible, as you say above)
- try heavier augmentation
- try pre-trained networks
- try ensembling
- try TTA (test-time augmentation)
- try other training strategies such as cosine annealing or a mixup generator, or other (non-Keras) augmentation libraries such as albumentations
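
To make the last suggestion a bit more concrete, here is a minimal sketch of two of the strategies named above, cosine annealing and mixup, in plain Python. The function names, `lr_min`, and `alpha=0.2` are assumptions chosen for illustration; `lr_max` defaults to the 0.0001 learning rate used in the question.

```python
import math
import random


def cosine_annealing_lr(epoch, total_epochs, lr_max=1e-4, lr_min=1e-6):
    """Learning rate that decays from lr_max to lr_min along a half cosine."""
    progress = min(epoch, total_epochs) / total_epochs
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * progress))


def mixup_pair(x1, y1, x2, y2, alpha=0.2, rng=random):
    """Blend two samples and their labels with a Beta(alpha, alpha) weight.

    x1 and x2 are flat lists of pixel values; y1 and y2 are numeric labels
    (0.0 or 1.0 for a binary problem like the one in the question).
    """
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    y = lam * y1 + (1 - lam) * y2
    return x, y, lam


# Learning rate at the start, middle, and end of a 10-epoch run.
for epoch in (0, 5, 10):
    print(epoch, cosine_annealing_lr(epoch, total_epochs=10))

# Mix a toy "image" of class 0 with a toy "image" of class 1.
x, y, lam = mixup_pair([0.0, 1.0], 0.0, [1.0, 0.0], 1.0)
print(x, y, lam)
```

A schedule function like this could be wired into Keras through the `LearningRateScheduler` callback, and mixup is usually applied per batch inside the data generator; both would need adapting to your own pipeline.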

Answer 2

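Have you turned off the Dropout layers during the testing phase?

Dropout layers are only meant to be active during training, where they help prevent overfitting; they should be inactive during testing. That is one reason tf.estimator became popular: it makes it easy to switch Dropout off with is_training=True/False. Note that Keras already does this for you in model.evaluate, model.predict, and the validation pass of model.fit, so this mainly matters if you call the model from your own training or inference code.

You can turn it off with tf.keras.backend.set_learning_phase(0) (deprecated in newer TensorFlow releases). Please make sure you are using tensorflow.keras, e.g. from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dense, Dropout, Input, Flatten. There is a difference between tf.keras and standalone keras, and I recommend tf.keras.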
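
To show what "on during training, off during testing" means, here is a plain-Python sketch of inverted dropout. It is an illustration of the behaviour, not the Keras implementation; the `dropout` function and its `rng` parameter are hypothetical. In Keras itself, the `Dropout` layer accepts a `training` argument when it is called, which plays the same role as the flag below.

```python
import random


def dropout(values, rate=0.5, training=False, rng=random):
    """Inverted dropout on a flat list of activations.

    training=True : each value is zeroed with probability `rate`, and the
                    survivors are scaled by 1 / (1 - rate).
    training=False: the values pass through unchanged.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        return list(values)
    keep_scale = 1.0 / (1.0 - rate)
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]


activations = [1.0, 2.0, 3.0, 4.0]
print(dropout(activations, training=False))  # unchanged at test time
print(dropout(activations, training=True))   # random zeros, survivors scaled up
```

Because the survivors are scaled up during training, the expected activation is the same in both modes, which is why nothing needs to be rescaled at test time.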

If you have already turned it off, here are my techniques for preventing overfitting:
- Do error analysis. Prof. Andrew Ng's course is the best material I know on this: https://www.coursera.org/learn/machine-learning-projects?specialization=deep-learning
- Check that the training and test sets have the same distribution, and use data augmentation (flip, rotate, ...).
- Increase the input shape to give the network more features. One of the best current techniques is the compound scaling method from https://arxiv.org/pdf/1905.11946.pdf
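
As a starting point for the distribution check, you could compare the class balance of the two splits. This is a rough sketch with invented labels; `class_proportions` and `max_proportion_gap` are hypothetical helper names, and both label lists are assumed to be non-empty. It only looks at label balance, not at differences in the images themselves.

```python
from collections import Counter


def class_proportions(labels):
    """Map each class label to its share of the dataset."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


def max_proportion_gap(train_labels, val_labels):
    """Largest absolute difference in class share between the two splits."""
    train = class_proportions(train_labels)
    val = class_proportions(val_labels)
    classes = set(train) | set(val)
    return max(abs(train.get(c, 0.0) - val.get(c, 0.0)) for c in classes)


# Invented labels for a binary problem: 0 and 1 are placeholder class ids.
train_labels = [0] * 60 + [1] * 40
val_labels = [0] * 30 + [1] * 70
print(class_proportions(train_labels))
print(class_proportions(val_labels))
print(max_proportion_gap(train_labels, val_labels))
```

A large gap suggests the validation set is not representative of the training set, which can explain part of the difference between training and validation scores.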

Hope this helps! Happy Coding!

2 answers

solution1
4 ACCEPTED 2019-07-17 13:36:31

solution2
2 2019-11-06 16:03:03
