
Keras L2 regularization makes the network not learn

I am trying to train a simple model on the MNIST dataset: a single hidden layer of 36 neurons.

from tensorflow.keras import (activations, layers, losses, models,
                              optimizers, regularizers)

NUM_CLASSES = 10
BATCH_SIZE = 128
EPOCHS = 100

model = models.Sequential([
    layers.Input(shape = x_train.shape[1:]),

    layers.Dense(units = 36, activation = activations.sigmoid, kernel_regularizer = regularizers.l2(0.0001)),
    layers.Dropout(0.5),

    layers.Dense(units = NUM_CLASSES, activation = activations.softmax)
])

model.summary()

model.compile(loss      = losses.CategoricalCrossentropy(),
              optimizer = optimizers.RMSprop(),
              metrics   = ['accuracy'])

history = model.fit(x_train, y_train,
                    batch_size = BATCH_SIZE,
                    epochs = EPOCHS,
                    verbose = 2,
                    validation_data = (x_val, y_val))

Without the l2 regularizer everything works, but as soon as I enable it training goes sideways and the accuracy stays stuck at roughly 10% (chance level for 10 classes):

Epoch 1/300
391/391 - 1s - loss: 2.4411 - accuracy: 0.0990 - val_loss: 2.3027 - val_accuracy: 0.1064

Epoch 2/300
391/391 - 0s - loss: 2.3374 - accuracy: 0.1007 - val_loss: 2.3031 - val_accuracy: 0.1064

Epoch 3/300
391/391 - 0s - loss: 2.3178 - accuracy: 0.1016 - val_loss: 2.3041 - val_accuracy: 0.1064

Epoch 4/300
391/391 - 0s - loss: 2.3089 - accuracy: 0.1045 - val_loss: 2.3026 - val_accuracy: 0.1064

Epoch 5/300
391/391 - 0s - loss: 2.3051 - accuracy: 0.1060 - val_loss: 2.3030 - val_accuracy: 0.1064

This happens both when I pass regularizers.l2 explicitly and when I pass the string "l2" as the argument.
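For what it's worth, the two argument forms are not equivalent: in Keras the string shorthand "l2" resolves to an L2 regularizer with its default factor of 0.01, which is a 100× stronger penalty than the regularizers.l2(0.0001) used above. A quick sketch to confirm this (assuming tf.keras):

```python
from tensorflow.keras import layers, regularizers

# The string "l2" resolves to Keras' default L2 factor (0.01),
# while the explicit object uses the factor passed to it (0.0001).
dense_str = layers.Dense(36, kernel_regularizer="l2")
dense_obj = layers.Dense(36, kernel_regularizer=regularizers.l2(0.0001))

print(float(dense_str.kernel_regularizer.l2))  # default factor
print(float(dense_obj.kernel_regularizer.l2))  # explicit factor
```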

Why exactly does this happen and what am I doing wrong?

I suspect that with a high dropout rate of 0.5, adding L2 regularization on top prevents the network from learning. Both dropout and L2 regularization are means of preventing overfitting, and combining them can over-constrain a network this small. Try a lower dropout rate together with the regularizer and see if the network trains properly. My experience to date is that dropout is more effective at controlling overfitting than regularizers.
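The suggestion above can be sketched as a small change to the original model — same architecture, with the dropout rate lowered so it does not compound with the L2 penalty (the rate of 0.2 and the flattened 784-pixel input shape are assumptions for illustration):

```python
from tensorflow.keras import activations, layers, models, regularizers

NUM_CLASSES = 10

# Same architecture as in the question, but with dropout lowered
# from 0.5 to 0.2 while keeping the L2 kernel regularizer.
model = models.Sequential([
    layers.Input(shape=(784,)),  # assumes flattened 28x28 MNIST images
    layers.Dense(units=36, activation=activations.sigmoid,
                 kernel_regularizer=regularizers.l2(0.0001)),
    layers.Dropout(0.2),
    layers.Dense(units=NUM_CLASSES, activation=activations.softmax),
])

model.summary()
```

If this trains past chance level, the dropout/regularization interaction was the culprit; the two rates can then be tuned against the validation curve.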
