I have around 8200 images for a face detection task. 4800 of them contain human faces. The other 3400 contain 3D human face masks (made of rubber/latex), cartoon human faces, and monkey faces. I want to detect whether a given image contains a real human face or not.
I have trained numerous networks with different hyperparameters, but every time my training accuracy shoots up to over 98% while validation accuracy stays at around 60-70%. I have tried networks containing 3-5 Conv layers and one FC layer. I used L2 regularization, batch norm, data augmentation and dropout to reduce overfitting. I then tried reducing the learning rate of the Adam optimizer as training progressed. I trained the network for more than 100 epochs, and sometimes up to 200 epochs. However, the best validation accuracy (on 20% of the dataset) I could achieve was 71%. Is there any way to improve the validation accuracy above 85%? I used the following architecture with an input image size of 256*256*3 and trained it with a batch size of 16.
import tensorflow as tf

# L2 penalty, applied only to the first Conv layer and the output layer
regularizer = tf.keras.regularizers.l2(l=0.005)
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(64, (5, 5), strides=(2, 2), activation='relu', input_shape=(256, 256, 3), kernel_regularizer=regularizer),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(96, (5, 5), padding='same', activation='relu', kernel_regularizer=None),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(128, (3, 3), padding='same', activation='relu', kernel_regularizer=None),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(256, (3, 3), padding='same', activation='relu', kernel_regularizer=None),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    # tf.keras.layers.Dense(2048, activation='relu', kernel_regularizer=regularizer),
    tf.keras.layers.Dense(4096, activation='relu', kernel_regularizer=None),
    tf.keras.layers.Dropout(0.4),
    tf.keras.layers.Dense(1, activation='sigmoid', kernel_regularizer=regularizer)
])
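The question mentions reducing the Adam learning rate as training progresses but does not show how. One way this could look is sketched below. The initial rate of 1e-3, the halving factor, and the 10-epoch interval are assumptions for illustration, not values from the post. Only the dataset size, the 80/20 split, and the batch size of 16 come from the description above.

```python
import tensorflow as tf

# From the post: about 8200 images, 20% held out for validation, batch size 16.
TOTAL_IMAGES = 8200
TRAIN_IMAGES = TOTAL_IMAGES * 80 // 100        # 6560 training images
BATCH_SIZE = 16
STEPS_PER_EPOCH = TRAIN_IMAGES // BATCH_SIZE   # 410 optimizer steps per epoch

# Assumed schedule (not from the post): start at 1e-3, halve every 10 epochs.
lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=1e-3,
    decay_steps=10 * STEPS_PER_EPOCH,
    decay_rate=0.5,
    staircase=True,
)
optimizer = tf.keras.optimizers.Adam(learning_rate=lr_schedule)
```

The optimizer would then be passed to model.compile(optimizer=optimizer, loss='binary_crossentropy', metrics=['accuracy']) for the single sigmoid output used here.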
Add SpatialDropout2D after all Conv layers.
Add BatchNormalization after all Conv and Dense layers (except for the last Dense/sigmoid one, obviously).
If all of those combined are not enough to get good validation accuracy, then you probably just don't have enough data.
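A minimal sketch of how those two suggestions might be applied to the architecture from the question. The helper name build_model, the spatial dropout rate of 0.2, and the exact ordering (Conv, then BatchNormalization, then SpatialDropout2D, then pooling) are assumptions. The answer only says which layers to add, not where within each block or with what rate.

```python
import tensorflow as tf


def build_model(input_shape=(256, 256, 3), dense_units=4096, spatial_drop=0.2):
    """Question's architecture plus SpatialDropout2D and BatchNormalization.

    spatial_drop=0.2 is an assumed rate, not a value from the answer.
    """
    layers = tf.keras.layers
    model = tf.keras.models.Sequential()
    model.add(tf.keras.Input(shape=input_shape))

    # (filters, kernel size, stride, padding) for each Conv block in the question.
    conv_specs = [
        (64, 5, 2, 'valid'),
        (96, 5, 1, 'same'),
        (128, 3, 1, 'same'),
        (256, 3, 1, 'same'),
    ]
    for filters, kernel, stride, padding in conv_specs:
        model.add(layers.Conv2D(filters, kernel, strides=stride,
                                padding=padding, activation='relu'))
        model.add(layers.BatchNormalization())            # after every Conv layer
        model.add(layers.SpatialDropout2D(spatial_drop))  # drops whole feature maps
        model.add(layers.MaxPooling2D(2))

    model.add(layers.Flatten())
    model.add(layers.Dense(dense_units, activation='relu'))
    model.add(layers.BatchNormalization())                # after the hidden Dense layer
    model.add(layers.Dropout(0.4))
    # No BatchNormalization after the final Dense/sigmoid layer.
    model.add(layers.Dense(1, activation='sigmoid'))
    return model
```

Calling build_model() with the defaults reproduces the 256*256*3 input and the 4096-unit Dense layer from the question. Smaller values for input_shape and dense_units make it quicker to experiment with.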
A few tips that probably won't reduce overfitting, but tend to be helpful in general: