I am trying to train a simple model on the MNIST dataset, with a single hidden layer of 36 neurons.
from tensorflow.keras import (activations, layers, losses, models,
                              optimizers, regularizers)

NUM_CLASSES = 10
BATCH_SIZE = 128
EPOCHS = 100

model = models.Sequential([
    layers.Input(shape=x_train.shape[1:]),
    layers.Dense(units=36, activation=activations.sigmoid,
                 kernel_regularizer=regularizers.l2(0.0001)),
    layers.Dropout(0.5),
    layers.Dense(units=NUM_CLASSES, activation=activations.softmax),
])
model.summary()

model.compile(loss=losses.CategoricalCrossentropy(),
              optimizer=optimizers.RMSprop(),
              metrics=['accuracy'])

history = model.fit(x_train, y_train,
                    batch_size=BATCH_SIZE,
                    epochs=EPOCHS,
                    verbose=2,
                    validation_data=(x_val, y_val))
Without the l2 regularizer everything works, but as soon as I add it, training goes sideways and the accuracy stays stuck at roughly 10%:
Epoch 1/300
391/391 - 1s - loss: 2.4411 - accuracy: 0.0990 - val_loss: 2.3027 - val_accuracy: 0.1064
Epoch 2/300
391/391 - 0s - loss: 2.3374 - accuracy: 0.1007 - val_loss: 2.3031 - val_accuracy: 0.1064
Epoch 3/300
391/391 - 0s - loss: 2.3178 - accuracy: 0.1016 - val_loss: 2.3041 - val_accuracy: 0.1064
Epoch 4/300
391/391 - 0s - loss: 2.3089 - accuracy: 0.1045 - val_loss: 2.3026 - val_accuracy: 0.1064
Epoch 5/300
391/391 - 0s - loss: 2.3051 - accuracy: 0.1060 - val_loss: 2.3030 - val_accuracy: 0.1064
This happens both when I pass regularizers.l2 explicitly as the argument and when I pass the string "l2".
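For reference, a minimal sketch of the two forms described above. Note that the two are not equivalent: the string alias resolves to Keras's default L2 factor (0.01), a much stronger penalty than the 0.0001 used explicitly, which may explain why both variants misbehave differently than expected:

```python
from tensorflow.keras import layers, regularizers

# Explicit regularizer object with a custom factor:
dense_a = layers.Dense(36, kernel_regularizer=regularizers.l2(0.0001))

# String alias; this resolves to an L2 regularizer with the
# default factor of 0.01, not 0.0001:
dense_b = layers.Dense(36, kernel_regularizer="l2")
```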
Why exactly does this happen and what am I doing wrong?
I suspect that with a high dropout rate of 0.5, adding regularization prevents the network from learning. Both dropout and L2 regularization are means of preventing overfitting, so stacking them can over-constrain a small network like this one. Try using a lower dropout rate together with the regularizer and see if the network trains properly. My experience to date is that dropout is more effective at controlling overfitting than weight regularizers.
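A minimal sketch of that suggestion, keeping the architecture from the question but lowering the dropout rate to 0.2 (the input shape of 784 flattened MNIST features is an assumption; substitute x_train.shape[1:] as in the original code):

```python
from tensorflow.keras import (activations, layers, losses, models,
                              optimizers, regularizers)

model = models.Sequential([
    layers.Input(shape=(784,)),           # flattened 28x28 MNIST images (assumed)
    layers.Dense(units=36, activation=activations.sigmoid,
                 kernel_regularizer=regularizers.l2(0.0001)),
    layers.Dropout(0.2),                  # lowered from 0.5
    layers.Dense(units=10, activation=activations.softmax),
])
model.compile(loss=losses.CategoricalCrossentropy(),
              optimizer=optimizers.RMSprop(),
              metrics=['accuracy'])
```

If this trains properly, you can tune the dropout rate and the L2 factor against the validation set rather than keeping both at their original values.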