Simple tf.keras Resnet50 model not converging

Question

I'm using the ResNet50V2 model from keras.applications for image classification, but I have had persistent problems getting the model to converge to any meaningful accuracy. I previously built the same model on the same data in Matlab and reached around 75% accuracy, but here the training accuracy just hovers around 30% and the loss does not drop. I suspect there is a really simple mistake somewhere, but I can't find it.

import tensorflow as tf

train_datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1./224,
    validation_split=0.2)

train_generator = train_datagen.flow_from_directory(main_dir,
                                                    class_mode='categorical',
                                                    batch_size=32,
                                                    target_size=(224,224),
                                                    shuffle=True,
                                                    subset='training')

validation_generator = train_datagen.flow_from_directory(main_dir,
                                                        target_size=(224, 224),
                                                        batch_size=32,
                                                        class_mode='categorical',
                                                        shuffle=True,
                                                        subset='validation')

IMG_SHAPE = (224, 224, 3)

base_model = tf.keras.applications.ResNet50V2(
    input_shape=IMG_SHAPE,
    include_top=False,
    weights='imagenet')

maxpool_layer = tf.keras.layers.GlobalMaxPooling2D()
prediction_layer = tf.keras.layers.Dense(4, activation='softmax')

model = tf.keras.Sequential([
    base_model,
    maxpool_layer,
    prediction_layer
])

opt = tf.keras.optimizers.Adam(learning_rate=0.001)
model.compile(optimizer=opt,
              loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

model.fit(
    train_generator,
    steps_per_epoch = train_generator.samples // 32,
    validation_data = validation_generator,
    validation_steps = validation_generator.samples // 32,
    epochs = 20)

Answer 1

Since your last layer already applies a softmax activation, your loss should not use from_logits=True. You would need from_logits=True only if the last layer had no softmax activation. The flag tells categorical_crossentropy whether the model outputs are raw scores (logits), to which the loss must apply softmax itself, or probabilities that are ready to use.
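
To see why the mismatch can keep the loss from dropping, here is a minimal plain-Python sketch of the arithmetic. It is not Keras's actual implementation, and the helper names softmax and cross_entropy are made up for illustration. It assumes the loss applies softmax to whatever it receives when from_logits=True; how a given TensorFlow version handles this case may differ.

```python
import math

def softmax(values):
    """Convert a list of scores into probabilities that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, true_index):
    """Categorical cross-entropy for a single one-hot label."""
    return -math.log(probs[true_index])

# Hypothetical best case: a softmax layer that is fully confident and correct
# on a 4-class problem (the question's Dense layer has 4 units).
model_output = [1.0, 0.0, 0.0, 0.0]
true_class = 0

# Treating the output as probabilities (from_logits=False): zero loss.
loss_as_probs = cross_entropy(model_output, true_class)

# Treating the same output as logits (from_logits=True): softmax is applied
# again to values confined to [0, 1], so the true class gets at most
# e / (e + 3) of the probability mass.
loss_as_logits = cross_entropy(softmax(model_output), true_class)

# Lower bound on the loss in the double-softmax case, about 0.744.
loss_floor = math.log(math.e + 3) - 1
```

Under that assumption, even a perfect prediction leaves the loss at about 0.744 with 4 classes, and the squashed outputs give weak gradients. In the question's code the fix is either to pass from_logits=False (the default) to CategoricalCrossentropy, or to drop activation='softmax' from the final Dense layer and keep from_logits=True.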

solution1
0 2020-06-29 23:18:54
