
Simple tf.keras Resnet50 model not converging

I'm using the ResNet50V2 model from keras.applications for image classification, but I have had persistent problems getting the model to converge to any meaningful accuracy. Previously, I built this same model on the same data in Matlab and reached around 75% accuracy, but now training just hovers around 30% accuracy and the loss does not drop. I suspect there is a really simple mistake somewhere, but I can't find it.

import tensorflow as tf

train_datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1./224,
    validation_split=0.2)

train_generator = train_datagen.flow_from_directory(main_dir,
                                                    class_mode='categorical',
                                                    batch_size=32,
                                                    target_size=(224,224),
                                                    shuffle=True,
                                                    subset='training')

validation_generator = train_datagen.flow_from_directory(main_dir,
                                                        target_size=(224, 224),
                                                        batch_size=32,
                                                        class_mode='categorical',
                                                        shuffle=True,
                                                        subset='validation')

IMG_SHAPE = (224, 224, 3)

base_model = tf.keras.applications.ResNet50V2(
    input_shape=IMG_SHAPE,
    include_top=False,
    weights='imagenet')

maxpool_layer = tf.keras.layers.GlobalMaxPooling2D()
prediction_layer = tf.keras.layers.Dense(4, activation='softmax')

model = tf.keras.Sequential([
    base_model,
    maxpool_layer,
    prediction_layer
])

opt = tf.keras.optimizers.Adam(learning_rate=0.001)
model.compile(optimizer=opt,
              loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

model.fit(
    train_generator,
    steps_per_epoch = train_generator.samples // 32,
    validation_data = validation_generator,
    validation_steps = validation_generator.samples // 32,
    epochs = 20)

Since your last layer applies a softmax activation, your loss should not use from_logits=True. If your last layer produced raw, unactivated outputs (logits) instead, then you would need from_logits=True. This is because categorical_crossentropy handles probability outputs differently from logits: with from_logits=True it applies softmax internally, so in your setup softmax is effectively applied twice, which flattens the outputs and distorts the gradients enough to stall training.
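To make the pairing concrete, here is a small sketch (with made-up one-hot targets and logit values) showing that the two consistent configurations produce the same loss, while the mismatched configuration from the question does not:

```python
import tensorflow as tf

# One-hot target and raw network outputs (logits) for a 4-class problem.
y_true = tf.constant([[0., 1., 0., 0.]])
logits = tf.constant([[1.0, 2.0, 0.5, -1.0]])

# Consistent pairing 1: raw logits + from_logits=True
# (the loss applies softmax internally).
loss_from_logits = tf.keras.losses.CategoricalCrossentropy(from_logits=True)(
    y_true, logits)

# Consistent pairing 2: softmax probabilities + from_logits=False (the default).
probs = tf.nn.softmax(logits)
loss_from_probs = tf.keras.losses.CategoricalCrossentropy()(y_true, probs)

# The mismatched pairing from the question: softmax output fed into a loss
# with from_logits=True, so softmax is effectively applied twice.
loss_mismatched = tf.keras.losses.CategoricalCrossentropy(from_logits=True)(
    y_true, probs)

print(float(loss_from_logits))  # same as loss_from_probs
print(float(loss_from_probs))
print(float(loss_mismatched))   # different (larger) value
```

So the fix for the posted model is either to keep `activation='softmax'` in the Dense layer and drop `from_logits=True` from the loss, or to drop the activation and keep `from_logits=True`.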
