I am trying to train a simple MobileNetV3Small from keras.applications, as shown below:
```python
import os
from tensorflow import keras
from tensorflow.keras.preprocessing.image import ImageDataGenerator

base_model = keras.applications.MobileNetV3Small(
    input_shape=INPUT_SHAPE,
    alpha=0.125,
    include_top=False,
    classes=1,
    dropout_rate=0.2,
    weights=None)

x = keras.layers.Flatten()(base_model.output)
preds = keras.layers.Dense(1, activation="sigmoid")(x)
model = keras.Model(inputs=base_model.input, outputs=preds)
model.compile(loss="binary_crossentropy",
              optimizer="RMSprop",
              metrics=["binary_accuracy"])

train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,
    rotation_range=40,
    horizontal_flip=True,
    vertical_flip=True,
)
train_generator = train_datagen.flow_from_directory(
    os.path.join(DATA_ROOT, 'train'),
    target_size=(56, 56),
    batch_size=128,
    class_mode="binary",
)

validation_datagen = ImageDataGenerator(rescale=1.0 / 255)
validation_generator = validation_datagen.flow_from_directory(
    os.path.join(DATA_ROOT, 'val'),
    target_size=(56, 56),
    batch_size=128,
    class_mode="binary",
)

model_checkpoint_callback = keras.callbacks.ModelCheckpoint(
    filepath=SAVE_DIR,
    save_weights_only=True,
    monitor='val_binary_accuracy',
    mode='max',
    save_best_only=True)
es_callback = keras.callbacks.EarlyStopping(patience=10)

model.fit(train_generator,
          epochs=100,
          validation_data=validation_generator,
          callbacks=[model_checkpoint_callback, es_callback],
          shuffle=True)
```
While training, I get a validation accuracy of around 0.94. But when I call model.evaluate on the exact same validation data, the accuracy drops to 0.48, and model.predict outputs a near-constant value of about 0.51 for any input. There is nothing wrong with the learning rate, optimizer, or metrics. What could be wrong here?
EDIT:
After training, when I run

```python
pred_results = model.evaluate(validation_generator)
print(pred_results)
```

the network trained for 1 epoch gives:

```
6/6 [==============================] - 1s 100ms/step - loss: 0.6935 - binary_accuracy: 0.8461
```

However, when I save and load the model with either model.save() or tf.keras.models.save_model(), the output becomes something like this:

```
6/6 [==============================] - 2s 100ms/step - loss: 0.6935 - binary_accuracy: 0.5028
[0.6935192346572876, 0.5027709603309631]
```

and the output of model.predict(validation_generator) is:

```
[[0.5080832]
 [0.5080832]
 [0.5080832]
 . . .
 [0.5080832]
 [0.5080832]]
```
What I've tried so far:

- tf.keras.utils.image_dataset_from_directory() instead of ImageDataGenerator
- Changing the momentum parameter of the MobileNet BatchNormalization layers one by one:

```python
for layer in model.layers[0].layers:
    if type(layer) is tf.keras.layers.BatchNormalization:
        layer.momentum = 0.9
```

The earlier changes have no effect, but after applying the BatchNormalization momentum change, I no longer get the same prediction for every input. However, evaluate() and predict() still report different accuracy values.
It might be worth trying model.save_weights('directory'), then rebuilding your model (here, that means re-running the base_model = ... code) and restoring the weights with model.load_weights('directory'). That is what I do in my own models, and when I do that, the accuracy/loss stay exactly the same before and after saving and loading.
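A minimal sketch of that round-trip, using a small stand-in model rather than the full MobileNet (the file name `demo.weights.h5` is made up for illustration):

```python
import numpy as np
from tensorflow import keras

def build_model():
    # Stand-in for the real architecture; the save/rebuild/load
    # pattern is identical for the MobileNetV3Small setup above.
    inp = keras.Input(shape=(4,))
    out = keras.layers.Dense(1, activation="sigmoid")(inp)
    model = keras.Model(inp, out)
    model.compile(loss="binary_crossentropy", optimizer="rmsprop")
    return model

x = np.random.rand(8, 4).astype("float32")

model = build_model()
model.save_weights("demo.weights.h5")   # weights only, no architecture
before = model.predict(x, verbose=0)

model = build_model()                   # rebuild the architecture from code
model.load_weights("demo.weights.h5")   # then restore the weights in place
after = model.predict(x, verbose=0)

# `before` and `after` should match exactly.
```

Because only weights are serialized, the rebuilt model must have exactly the same architecture as the one that was saved.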
If you run pred_results = model.evaluate(validation_generator) right after fitting, the weights in memory at that moment are the ones from the last training epoch. What you have to do after model.fit is load the weights saved by model_checkpoint_callback, with something like:

```python
model.load_weights(SAVE_DIR)  # restores in place; load_weights does not return the model
pred_results = model.evaluate(validation_generator)
print(pred_results)
```
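As a self-contained sketch of that checkpoint round-trip, here is a tiny stand-in model (the data, layer sizes, and the `best.weights.h5` path are invented for the demo, not taken from the question):

```python
import numpy as np
from tensorflow import keras

# Toy binary-classification data standing in for the image generators.
x = np.random.rand(32, 4).astype("float32")
y = (x.sum(axis=1) > 2).astype("float32")

inp = keras.Input(shape=(4,))
out = keras.layers.Dense(1, activation="sigmoid")(inp)
model = keras.Model(inp, out)
model.compile(loss="binary_crossentropy", optimizer="rmsprop",
              metrics=["binary_accuracy"])

# Same callback configuration as in the question: best weights only.
ckpt = keras.callbacks.ModelCheckpoint(
    filepath="best.weights.h5", save_weights_only=True,
    monitor="val_binary_accuracy", mode="max", save_best_only=True)

model.fit(x, y, validation_data=(x, y), epochs=3,
          callbacks=[ckpt], verbose=0)

# After fit, `model` holds the LAST epoch's weights; restore the BEST ones
# before evaluating, otherwise the numbers won't match the checkpointed run.
model.load_weights("best.weights.h5")
results = model.evaluate(x, y, verbose=0)  # [loss, binary_accuracy]
```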
Have you tried setting shuffle=False in validation_datagen.flow_from_directory()? It's a little misleading, but the .flow_from_directory() method shuffles by default, which is problematic when generating your validation dataset: the validation data is shuffled every time you call .predict, so the predictions no longer line up with the labels. In your training loop, by contrast, the .fit method implicitly does NOT shuffle the validation set.

The reason I think this is the issue is that you report ~0.5 accuracy when calling .predict() on the validation set, and you're running a binary classification (sigmoid output with binary cross-entropy loss). That makes perfect sense IF you're (mistakenly) shuffling your validation data: comparing predictions against misaligned labels on a balanced dataset gives about 50% accuracy (0.5 for 0, 0.5 for 1), since at that point it's equivalent to guessing.

Source: I've built and trained a lot of image classification models before, and this has happened to me a lot.
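A small sketch of the idea, using a synthetic on-disk dataset (the class folder names, image counts, and sizes are made up for illustration):

```python
import os
import tempfile
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Build a tiny synthetic dataset on disk: one folder per class, a few PNGs each.
root = tempfile.mkdtemp()
for cls in ("cats", "dogs"):
    os.makedirs(os.path.join(root, cls))
    for i in range(3):
        img = tf.cast(
            tf.random.uniform((56, 56, 3), maxval=256, dtype=tf.int32), tf.uint8)
        tf.io.write_file(os.path.join(root, cls, f"{i}.png"), tf.io.encode_png(img))

# shuffle=False keeps batches in file order, so the i-th row of
# model.predict(val_gen) lines up with val_gen.classes[i].
val_gen = ImageDataGenerator(rescale=1.0 / 255).flow_from_directory(
    root,
    target_size=(56, 56),
    batch_size=2,
    class_mode="binary",
    shuffle=False,  # the crucial flag; the default is True
)

labels = val_gen.classes  # ground-truth labels, in file order
# preds = model.predict(val_gen)  # would now align row-for-row with `labels`
```

With shuffle=True (the default), the yielded batch order would not match val_gen.classes, which is exactly the misalignment described above.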