简体   繁体   中英

Simple tf.keras Resnet50 model not converging

I'm using the ResNet50v2 model from keras.applications for image classification but I have had persisting problems trying to get the model to converge on any meaningful accuracy. Previously, I have developed this same model with the same data in Matlab and reached around 75% accuracy but now the training just hovers around 30% accuracy and the loss does not drop. I'm thinking that there is a really simple mistake somewhere but I can't find it.

import tensorflow as tf

train_datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1./224,
    validation_split=0.2)

train_generator = train_datagen.flow_from_directory(main_dir,
                                                    class_mode='categorical',
                                                    batch_size=32,
                                                    target_size=(224,224),
                                                    shuffle=True,
                                                    subset='training')

validation_generator = train_datagen.flow_from_directory(main_dir,
                                                        target_size=(224, 224),
                                                        batch_size=32,
                                                        class_mode='categorical',
                                                        shuffle=True,
                                                        subset='validation')

IMG_SHAPE = (224, 224, 3)

base_model = tf.keras.applications.ResNet50V2(
    input_shape=IMG_SHAPE,
    include_top=False,
    weights='imagenet')

maxpool_layer = tf.keras.layers.GlobalMaxPooling2D()
prediction_layer = tf.keras.layers.Dense(4, activation='softmax')

model = tf.keras.Sequential([
    base_model,
    maxpool_layer,
    prediction_layer
])

opt = tf.keras.optimizers.Adam(learning_rate=0.001)
model.compile(optimizer=opt,
              loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

model.fit(
    train_generator,
    steps_per_epoch = train_generator.samples // 32,
    validation_data = validation_generator,
    validation_steps = validation_generator.samples // 32,
    epochs = 20)

Since your last layer contains a softmax activation, your loss doesn't need from_logits=True . However, if you didn't have a softmax activation, you would need from_logits=True . This is because categorical_crossentropy handles probability outputs differently from logits.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM