
How can I use data augmentation in Keras, and how do I prevent overfitting on the MNIST dataset?

I want to train a Keras neural network on the MNIST dataset. The problem is that my model already overfits after 1 or 2 epochs. To combat this, I wanted to use data augmentation:

First I load the data:

#imports
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.datasets import mnist
from tensorflow.keras.layers import Flatten, Dense, Dropout

#load the MNIST dataset
(tr_images, tr_labels), (test_images, test_labels) = mnist.load_data()

#normalize images
tr_images, test_images = preprocess(tr_images, test_images)

#helper which returns the number of train images, test images and classes
amount_train_images, amount_test_images, total_classes = get_data_information(tr_images, tr_labels, test_images, test_labels)

#convert labels into the respective vectors
tr_vector_labels = keras.utils.to_categorical(tr_labels, total_classes) 
test_vector_labels = keras.utils.to_categorical(test_labels, total_classes)
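
(For reference, the preprocess helper used above isn't shown; a minimal version, assuming it simply rescales pixel intensities into [0, 1], might look like this:)

def preprocess(train, test):
    #scale uint8 pixel values from [0, 255] to floats in [0, 1]
    return train.astype("float32") / 255.0, test.astype("float32") / 255.0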

I create a model with a "create_model" function:

untrained_model = create_model()

This is the function definition:

def create_model(_learning_rate=0.01, _momentum=0.9, _decay=0.001, _dense_neurons=128, _fully_connected_layers=3, _loss="sparse_categorical_crossentropy", _dropout=0.1):
    #create model
    model = keras.Sequential()

    #input
    model.add(Flatten(input_shape=(28, 28)))
    
    #add fully connected layers
    for i in range(_fully_connected_layers):
        model.add(Dense(_dense_neurons, activation='relu'))

    model.add(Dropout(_dropout))

    #classifier head: softmax yields a probability distribution over the classes
    #(sigmoid is meant for binary/multi-label outputs, not 10-way classification)
    model.add(Dense(total_classes, activation='softmax'))

    optimizer = keras.optimizers.SGD(
        learning_rate=_learning_rate,
        momentum=_momentum,
        decay=_decay
    )

    #compile
    model.compile(
        optimizer=optimizer,
        loss=_loss,
        metrics=['accuracy']
    )

    return model

The function returns a compiled but untrained model. I also use this function when I try to optimize the hyperparameters (hence the many parameters). Then I create an ImageDataGenerator:

generator = tf.keras.preprocessing.image.ImageDataGenerator(
                rotation_range=0.15,     #note: rotation_range is in degrees, so 0.15 rotates by less than a degree
                width_shift_range=0.15,  #fraction of total width
                height_shift_range=0.15, #fraction of total height
                zoom_range=0.15
            )
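
(A quick way to sanity-check these augmentation settings is to pull a single batch out of the generator with flow. Note that flow expects rank-4 input, so the grayscale channel axis has to be added first; a minimal sketch:)

#flow expects (num_images, height, width, channels)
x = tr_images[..., np.newaxis]   #(60000, 28, 28) -> (60000, 28, 28, 1)
batch_x, batch_y = next(generator.flow(x, tr_labels, batch_size=9))
print(batch_x.shape)             #(9, 28, 28, 1)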

Now I want to train the model with my train_model_with_data_augmentation function:

train_model_with_data_augmentation(
                tr_images=tr_images, 
                tr_labels=tr_labels, 
                test_images=test_images, 
                test_labels=test_labels, 
                model=untrained_model,
                generator=generator,
                hyperparameters=hyperparameters
            )

However, I don't know how to use this generator with the model I've created, because the only method I found was the generator's fit method, and I want to train my model, not the generator.

Here is the graph that I get from the training history: https://ibb.co/sKFnwGr

  1. Can I somehow convert the generator to data that I can use as parameters in the fit method of the model?
  2. If not: How can I train the model I've created with this generator? (or do I have to implement data augmentation in a completely different way?)
  3. Does data augmentation even make sense for the MNIST dataset?
  4. What other options are there to prevent overfitting on MNIST?

Update: I tried to use this code:

generator.fit(x_train)
model.fit(generator.flow(x_train, y_train, batch_size=32), steps_per_epoch=len(x_train)/32, epochs=epochs)

However, I get this error message: ValueError: "Input to .fit() should have rank 4. Got array with shape: (60000, 28, 28)"

I believe the input to the fit method should have shape (image index, height, width, depth), i.e. four dimensions, while my x_train array only has three dimensions and lacks a depth (channel) dimension. I tried to expand it:

x_train = x_train[..., np.newaxis] #add channel axis: (60000, 28, 28) -> (60000, 28, 28, 1)
y_train = y_train[..., np.newaxis] #note: the labels don't actually need an extra axis

But then I get this error message: "Error occurred when finalizing GeneratorDataset iterator: Failed precondition: Python interpreter state is not initialized. The process may be terminated."
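
(For reference, a version of the update snippet that fixes the shape problem, assuming the same variable names: only the images need the extra channel axis, generator.fit is only required when featurewise statistics are enabled, and steps_per_epoch should be an integer. Whether it also clears the second error can depend on the environment:)

x_train = x_train[..., np.newaxis]   #(60000, 28, 28, 1); y_train stays as-is
model.fit(generator.flow(x_train, y_train, batch_size=32),
          steps_per_epoch=len(x_train) // 32,
          epochs=epochs)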

A working example of using ImageDataGenerator can be found here. The example itself:

#num_classes, model and epochs are assumed to be defined elsewhere
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.utils import to_categorical

(x_train, y_train), (x_test, y_test) = cifar10.load_data()

y_train = to_categorical(y_train, num_classes)
y_test = to_categorical(y_test, num_classes)

datagen = ImageDataGenerator(
    featurewise_center=True,
    featurewise_std_normalization=True,
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True)
# compute quantities required for featurewise normalization
# (std, mean, and principal components if ZCA whitening is applied)
datagen.fit(x_train)

# fits the model on batches with real-time data augmentation:
model.fit(datagen.flow(x_train, y_train, batch_size=32),
          steps_per_epoch=len(x_train) // 32, epochs=epochs)
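
(Adapting that example to MNIST is mostly a matter of dropping horizontal_flip, since a mirrored digit is a different symbol, adding the grayscale channel axis, and keeping the transforms modest. A hedged sketch, reusing the variable names from the question:)

mnist_gen = tf.keras.preprocessing.image.ImageDataGenerator(
    rotation_range=10,       #degrees; small rotations keep digits readable
    width_shift_range=0.1,
    height_shift_range=0.1,
    zoom_range=0.1           #no horizontal_flip for digits
)

x_aug = tr_images[..., np.newaxis]   #add grayscale channel: (60000, 28, 28, 1)
untrained_model.fit(
    mnist_gen.flow(x_aug, tr_labels, batch_size=32),   #integer labels match the sparse loss
    steps_per_epoch=len(x_aug) // 32,
    validation_data=(test_images[..., np.newaxis], test_labels),
    epochs=10
)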
