
How can I intentionally overfit a convolutional neural net in Keras to make sure the model is working?

I'm trying to diagnose what's causing low accuracies when training my model. At this point, I just want to be able to reach high training accuracy (I can worry about test accuracy and overfitting problems later). How can I adjust the model so that it overfits, i.e. maximizes training accuracy? I want to do this to make sure I didn't make any mistakes in a preprocessing step (shuffling, splitting, normalizing, etc.).

# imports (omitted in the original snippet; these match the API used below)
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense
from keras.utils import np_utils

# PARAMS
dropout_prob = 0.2
activation_function = 'relu'
loss_function = 'categorical_crossentropy'
verbose_level = 1
convolutional_batches = 32
convolutional_epochs = 5
inp_shape = X_train.shape[1:]  # X_train/y_train/X_test/y_test come from the preprocessing step
num_classes = 3


def train_convolutional_neural():
    y_train_cat = np_utils.to_categorical(y_train, num_classes)
    y_test_cat = np_utils.to_categorical(y_test, num_classes)

    model = Sequential()
    model.add(Conv2D(filters=16, kernel_size=(3, 3), input_shape=inp_shape))
    model.add(Conv2D(filters=32, kernel_size=(3, 3)))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(rate=dropout_prob))
    model.add(Flatten())
    model.add(Dense(64, activation=activation_function))
    model.add(Dense(num_classes, activation='softmax'))
    model.summary()
    model.compile(loss=loss_function, optimizer='adam', metrics=['accuracy'])
    history = model.fit(X_train, y_train_cat,
                        batch_size=convolutional_batches,
                        epochs=convolutional_epochs,
                        verbose=verbose_level,
                        validation_data=(X_test, y_test_cat))
    model.save('./models/convolutional_model.h5')
    return history

You need to remove the Dropout layer. Here is a small checklist for intentional overfitting:

  • Remove any regularization (Dropout, L1 and L2 weight penalties)
  • Make sure the learning rate is suitably low (Adam is adaptive, so in your case it is fine)
  • You may want to not shuffle the training samples (e.g. the first 100 samples are class A, the next 100 are class B, the last 100 are class C). Update: as petezurich points out in the answer below, this should be considered with care, as it could lead to no training effect at all.
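Applied to the model from the question, the checklist might look like the following sketch: Dropout is removed, and ReLU activations are added to the conv layers (the original left them linear). The input shape and class count here are stand-ins; substitute `X_train.shape[1:]` and your own `num_classes`.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense

# Stand-in values; replace with X_train.shape[1:] and your own class count.
inp_shape = (32, 32, 1)
num_classes = 3

model = Sequential([
    Input(shape=inp_shape),
    # Same stack as the question, minus Dropout (we *want* memorization),
    # plus ReLU on the conv layers, which the original code left linear.
    Conv2D(16, (3, 3), activation='relu'),
    Conv2D(32, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(64, activation='relu'),
    Dense(num_classes, activation='softmax'),
])
model.compile(loss='categorical_crossentropy', optimizer='adam',
              metrics=['accuracy'])
```

If this stripped-down model still cannot push training accuracy up, the problem is more likely in the data pipeline than in the regularization.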

Now, if your model overfits easily, that is a good sign of a strong model, capable of representing the data. Otherwise, you may consider a deeper/wider model, or you should take a good look at the data and ask: "Are there really any patterns? Is this trainable at all?"

In addition to the other valid answers – one very simple way to overfit is to use only a small subset of your data, e.g. only 1 or 2 samples.

See also this extremely helpful post regarding everything that you can check to make sure your model is working: https://blog.slavv.com/37-reasons-why-your-neural-network-is-not-working-4020854bd607
