
Tensorflow: loss and accuracy stay flat training CNN on image classification

I copied and pasted this TensorFlow tutorial into a Jupyter notebook. (As of this writing, the tutorial has switched to the flowers dataset instead of the cats and dogs one, but the question still applies.) https://www.tensorflow.org/tutorials/images/classification

The first part (without augmentation) runs fine and I get similar results.

But with data augmentation, my loss and accuracy stay flat across all epochs. I've already checked these posts on SO: "Keras accuracy does not change"; "How to fix flatlined accuracy and NaN loss in tensorflow image classification"; "Tensorflow: loss decreasing, but accuracy stable".

None of these applied: the dataset is a standard one, so corrupted data is not the problem, and I printed a few augmented images and they look fine (see below).

I've tried adding more fully connected layers to increase the model capacity and dropout to limit overfitting, but nothing changes. Here are the curves:

Any ideas as to why? Have I missed something in the code? I know training a DL model is a lot of trial and error, but I'm sure there must be some logic or intuition beyond randomly turning the knobs until something happens.

Thanks!

[Image: loss and accuracy curves]

Source Data:

import os
import tensorflow as tf

_URL = 'https://storage.googleapis.com/mledu-datasets/cats_and_dogs_filtered.zip'
path_to_zip = tf.keras.utils.get_file('cats_and_dogs.zip', origin=_URL, extract=True)
PATH = os.path.join(os.path.dirname(path_to_zip), 'cats_and_dogs_filtered')

Params:

batch_size = 128
epochs = 15
IMG_HEIGHT = 150
IMG_WIDTH = 150

Preprocessing stage:

image_gen = ImageDataGenerator(rescale=1./255,
    rotation_range=20,
    width_shift_range=0.15,
    height_shift_range=0.15,
    horizontal_flip=True,
    zoom_range=0.2)

train_data_gen = image_gen.flow_from_directory(batch_size=batch_size,
                                               directory=train_dir,
                                               shuffle=True,
                                               target_size=(IMG_HEIGHT, IMG_WIDTH))

augmented_images = [train_data_gen[0][0][i] for i in range(5)]
plotImages(augmented_images)

image_gen_val = ImageDataGenerator(rescale=1./255)

val_data_gen = image_gen_val.flow_from_directory(batch_size=batch_size,
                                                 directory=validation_dir,
                                                 target_size=(IMG_HEIGHT, IMG_WIDTH),
                                                 class_mode='binary')

[Image: sample of augmented training images]

Model:

model_new = Sequential([
    Conv2D(16, 2, padding='same', activation='relu',
           input_shape=(IMG_HEIGHT, IMG_WIDTH, 3)),
    MaxPooling2D(),
    Conv2D(32, 2, padding='same', activation='relu'),
    MaxPooling2D(),
    Conv2D(64, 2, padding='same', activation='relu'),
    MaxPooling2D(),
    Dropout(0.2),
    Flatten(),
    Dense(512, activation='relu'),
    Dense(1)
])

model_new.compile(optimizer='adam',
                  loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
                  metrics=['accuracy'])

model_new.summary()

history = model_new.fit(
    train_data_gen,
    steps_per_epoch= total_train // batch_size,
    epochs=epochs,
    validation_data=val_data_gen,
    validation_steps= total_val // batch_size
)

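As suggested by @today, class_mode='binary' was missing from the training data generator. Without it, flow_from_directory falls back to its default class_mode='categorical' and yields one-hot labels of shape (batch_size, 2), which do not match the model's single output unit and binary cross-entropy loss. With the argument added, the model trains properly.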

train_data_gen = image_gen.flow_from_directory(batch_size=batch_size,
                                               directory=train_dir,
                                               shuffle=True,
                                               target_size=(IMG_HEIGHT, IMG_WIDTH),
                                               class_mode='binary')  # scalar 0/1 labels, same as the validation generator
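To give some intuition for why the label format matters, here is a minimal pure-Python sketch rather than the actual Keras internals. It assumes that the single output logit gets broadcast against both one-hot columns and that the two per-column losses are averaged. That is one plausible way flat curves can arise, and some TensorFlow versions may raise a shape error instead. The helper names (one_hot, bce_from_logit, loss_binary_labels, loss_broadcast_one_hot) are hypothetical and exist only for this illustration.

```python
import math


def one_hot(class_index, num_classes=2):
    """Label format yielded by class_mode='categorical' (the default)."""
    return [1.0 if i == class_index else 0.0 for i in range(num_classes)]


def bce_from_logit(label, logit):
    """Binary cross-entropy for one label and one raw logit (stable form)."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def loss_binary_labels(class_index, logit):
    """class_mode='binary': one scalar label (0 or 1) per image."""
    return bce_from_logit(float(class_index), logit)


def loss_broadcast_one_hot(class_index, logit):
    """ASSUMPTION: the single logit is compared against both one-hot
    columns and the two per-column losses are averaged."""
    per_column = [bce_from_logit(y, logit) for y in one_hot(class_index)]
    return sum(per_column) / len(per_column)


if __name__ == "__main__":
    for logit in (-2.0, 0.0, 2.0):
        print(
            "logit", logit,
            "| one-hot labels:",
            loss_broadcast_one_hot(0, logit), loss_broadcast_one_hot(1, logit),
            "| binary labels:",
            loss_binary_labels(0, logit), loss_binary_labels(1, logit),
        )
```

Under that assumption, the one-hot loss is identical for both classes at every logit, so the gradient carries no information about which class an image belongs to, and the best the model can do is output a logit of 0 (loss ln 2, about 0.693). With scalar 0/1 labels the loss differs between the classes, which is what lets training make progress.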
