
Training accuracy is less than validation accuracy

I have created a CNN model for classifying text data. Please help me interpret my result and tell me why my training accuracy is lower than my validation accuracy.

I have a total of 2619 samples, all of them text data, belonging to two different classes. Here is a sample of my dataset.

[dataset sample image]

The validation set has 34 samples (with 75 folds, each fold holds roughly 2619 / 75 ≈ 35); the remaining 2585 of the 2619 samples are training data.

I have used RepeatedKFold cross-validation. Here is my code.

from sklearn.model_selection import RepeatedKFold

# 75 splits over 2619 samples -> each validation fold holds 34-35 samples
kf = RepeatedKFold(n_splits=75, n_repeats=1, random_state=42)

for train_index, test_index in kf.split(X, Y):
    # print("Train:", train_index, "Validation:", test_index)
    x_train, x_test = X.iloc[train_index], X.iloc[test_index]
    y_train, y_test = Y.iloc[train_index], Y.iloc[test_index]

I have used a CNN. Here is my model.

from keras.models import Sequential
from keras.layers import Embedding, Conv1D, BatchNormalization, Dropout, Flatten, Dense
from keras.regularizers import l2
from keras import optimizers

model = Sequential()
model.add(Embedding(2900, 2, input_length=1))
model.add(Conv1D(filters=2, kernel_size=3, kernel_regularizer=l2(0.0005), bias_regularizer=l2(0.0005), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.3))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(1, kernel_regularizer=l2(0.0005), bias_regularizer=l2(0.0005), activation='sigmoid'))
model.add(Dropout(0.25))

adam = optimizers.Adam(lr=0.0005, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0.0, amsgrad=False)
model.compile(loss='binary_crossentropy', optimizer=adam, metrics=['accuracy'])
print(model.summary())

history = model.fit(x_train, y_train, epochs=300, validation_data=(x_test, y_test), batch_size=128, shuffle=False)

# Final evaluation of the model
scores = model.evaluate(x_test, y_test, verbose=0)
print("Accuracy: %.2f%%" % (scores[1] * 100))

And here is the result.

Epoch 295/300
2585/2585 [==============================] - 0s 20us/step - loss: 1.6920 - acc: 0.7528 - val_loss: 0.5839 - val_acc: 0.8235
Epoch 296/300
2585/2585 [==============================] - 0s 20us/step - loss: 1.6532 - acc: 0.7617 - val_loss: 0.5836 - val_acc: 0.8235
Epoch 297/300
2585/2585 [==============================] - 0s 27us/step - loss: 1.5328 - acc: 0.7551 - val_loss: 0.5954 - val_acc: 0.8235
Epoch 298/300
2585/2585 [==============================] - 0s 20us/step - loss: 1.6289 - acc: 0.7524 - val_loss: 0.5897 - val_acc: 0.8235
Epoch 299/300
2585/2585 [==============================] - 0s 21us/step - loss: 1.7000 - acc: 0.7582 - val_loss: 0.5854 - val_acc: 0.8235
Epoch 300/300
2585/2585 [==============================] - 0s 25us/step - loss: 1.5475 - acc: 0.7451 - val_loss: 0.5934 - val_acc: 0.8235
Accuracy: 82.35%

Please help me with my problem. Thank you.

You may have too much regularization for your model, causing it to underfit your data.
A good way to start is to begin with no regularization at all (no Dropout, no weight decay, ...) and check whether it overfits:

  • If not, regularization is useless.
  • If it overfits, add regularization little by little: start with a small dropout rate / weight decay, and then increase it if the model continues to overfit.

Moreover, don't put Dropout as the final layer, and don't put two Dropout layers back to back; see the sketch below.
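To make the advice concrete, here is a minimal sketch of such a stripped-down baseline, keeping the question's architecture and (older) Keras API but removing every Dropout layer and L2 penalty; it is an illustration of the idea, not a tested fix:

from keras.models import Sequential
from keras.layers import Embedding, Conv1D, BatchNormalization, Flatten, Dense
from keras import optimizers

# Same architecture as the question, with all regularization removed,
# so you can first check whether the model overfits at all.
model = Sequential()
model.add(Embedding(2900, 2, input_length=1))
model.add(Conv1D(filters=2, kernel_size=3, padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(Flatten())
model.add(Dense(1, activation='sigmoid'))  # the sigmoid output stays last: no Dropout after it

model.compile(loss='binary_crossentropy', optimizer=optimizers.Adam(lr=0.0005), metrics=['accuracy'])

If this baseline does overfit, a single small Dropout placed between BatchNormalization and Flatten is a reasonable first layer to reintroduce.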

Your training accuracy is lower than your validation accuracy most likely because of dropout: it "turns off" some neurons during training to prevent overfitting, whereas during validation dropout is disabled, so the network uses all of its neurons and therefore (in this particular case) makes more accurate predictions.
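You can observe this behaviour directly by calling a lone Dropout layer in both modes; the snippet below uses the tf.keras API (newer than the Keras version in the question) purely for illustration:

import numpy as np
import tensorflow as tf

layer = tf.keras.layers.Dropout(0.5)
x = np.ones((1, 8), dtype="float32")

# Training mode: roughly half the units are zeroed and the survivors are
# scaled by 1 / (1 - 0.5) = 2, so the expected activation stays the same.
print(layer(x, training=True))

# Inference mode: dropout is a no-op, so the output equals the input.
print(layer(x, training=False))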

In general, I agree with the advice from Thibault Bacqueyrisses and want to add that it is also usually bad practice to put dropout before batch normalization (though that is not what happens in this particular case).
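For reference, here is a schematic of the ordering that is usually recommended (an illustrative sketch, not code taken from the question):

from keras.models import Sequential
from keras.layers import Embedding, Conv1D, BatchNormalization, Activation, Dropout

# Dropout, if used at all, goes after BatchNormalization, not before it,
# so the noise it injects does not distort the batch statistics.
model = Sequential()
model.add(Embedding(2900, 2, input_length=1))
model.add(Conv1D(filters=2, kernel_size=3, padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.25))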
