
Simple tf.keras Resnet50 model not converging

I'm using the ResNet50V2 model from keras.applications for image classification, but I have had persistent problems getting the model to converge to any meaningful accuracy. Previously, I built this same model on the same data in Matlab and reached around 75% accuracy, but now training just hovers around 30% accuracy and the loss does not drop. I suspect there is a really simple mistake somewhere, but I can't find it.

import tensorflow as tf

train_datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1./224,
    validation_split=0.2)

train_generator = train_datagen.flow_from_directory(main_dir,
                                                    class_mode='categorical',
                                                    batch_size=32,
                                                    target_size=(224,224),
                                                    shuffle=True,
                                                    subset='training')

validation_generator = train_datagen.flow_from_directory(main_dir,
                                                        target_size=(224, 224),
                                                        batch_size=32,
                                                        class_mode='categorical',
                                                        shuffle=True,
                                                        subset='validation')

IMG_SHAPE = (224, 224, 3)

base_model = tf.keras.applications.ResNet50V2(
    input_shape=IMG_SHAPE,
    include_top=False,
    weights='imagenet')

maxpool_layer = tf.keras.layers.GlobalMaxPooling2D()
prediction_layer = tf.keras.layers.Dense(4, activation='softmax')

model = tf.keras.Sequential([
    base_model,
    maxpool_layer,
    prediction_layer
])

opt = tf.keras.optimizers.Adam(learning_rate=0.001)
model.compile(optimizer=opt,
              loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

model.fit(
    train_generator,
    steps_per_epoch = train_generator.samples // 32,
    validation_data = validation_generator,
    validation_steps = validation_generator.samples // 32,
    epochs = 20)

Since your last layer applies a softmax activation, your loss should not use from_logits=True. If your last layer produced raw, unactivated outputs (logits) instead, then you would need from_logits=True. This is because categorical_crossentropy handles probability outputs differently from logits: with from_logits=True it applies softmax internally, so in your setup softmax is effectively applied twice, which flattens the outputs and distorts the gradients enough to stall training.
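To make the pairing concrete, here is a small sketch (with made-up one-hot targets and logit values) showing that the two consistent configurations produce the same loss, while the mismatched configuration from the question does not:

```python
import tensorflow as tf

# One-hot target and raw network outputs (logits) for a 4-class problem.
y_true = tf.constant([[0., 1., 0., 0.]])
logits = tf.constant([[1.0, 2.0, 0.5, -1.0]])

# Consistent pairing 1: raw logits + from_logits=True
# (the loss applies softmax internally).
loss_from_logits = tf.keras.losses.CategoricalCrossentropy(from_logits=True)(
    y_true, logits)

# Consistent pairing 2: softmax probabilities + from_logits=False (the default).
probs = tf.nn.softmax(logits)
loss_from_probs = tf.keras.losses.CategoricalCrossentropy()(y_true, probs)

# The mismatched pairing from the question: softmax output fed into a loss
# with from_logits=True, so softmax is effectively applied twice.
loss_mismatched = tf.keras.losses.CategoricalCrossentropy(from_logits=True)(
    y_true, probs)

print(float(loss_from_logits))  # same as loss_from_probs
print(float(loss_from_probs))
print(float(loss_mismatched))   # different (larger) value
```

So the fix for the posted model is either to keep `activation='softmax'` in the Dense layer and drop `from_logits=True` from the loss, or to drop the activation and keep `from_logits=True`.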
