Good test accuracy but poor confusion matrix results
I've trained a model to classify 4 types of eye diseases using MobileNet as the pretrained model. I achieved a test accuracy of 94%, but when I look at the confusion matrix, the model doesn't seem to be doing well at all. Loss is relatively low on training, validation, and testing. Any suggestions on where I went wrong, or am I missing something conceptually?
Image_height = 224
Image_width = 224
val_split = 0.20
batches_size = 16
lr = 0.0005
spe = 220
vs = 32
epoch = 6
# Getting the file of the training set and testing set
train_folder = "/content/drive/My Drive/Research/train"
test_folder = "/content/drive/My Drive/Research/test"
#Creating batches
train_batches = ImageDataGenerator(preprocessing_function=tf.keras.applications.mobilenet.preprocess_input,validation_split=val_split) \
.flow_from_directory(directory=train_folder, target_size=(Image_height,Image_width), classes=['CNV','DME','DRUSEN','NORMAL'], batch_size=batches_size,class_mode="categorical",
subset="training")
validation_batches = ImageDataGenerator(preprocessing_function=tf.keras.applications.mobilenet.preprocess_input,validation_split=val_split) \
.flow_from_directory(directory=train_folder, target_size=(Image_height,Image_width), classes=['CNV','DME','DRUSEN','NORMAL'], batch_size=batches_size,class_mode="categorical",
subset="validation")
test_batches = ImageDataGenerator(preprocessing_function=tf.keras.applications.mobilenet.preprocess_input) \
.flow_from_directory(test_folder, target_size=(Image_height,Image_width),
classes=['CNV','DME','DRUSEN','NORMAL'], batch_size=batches_size,class_mode="categorical")
mobile = tf.keras.applications.mobilenet.MobileNet(include_top=False,
input_shape=(224, 224,3),
pooling='max', weights='imagenet',
alpha=1, depth_multiplier=1,dropout=.5)
x=mobile.layers[-1].output
x=keras.layers.BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001 )(x)
predictions=Dense (4, activation='softmax')(x)
model = Model(inputs=mobile.input, outputs=predictions)
for layer in model.layers:
    layer.trainable=True
model.compile(Adamax(lr=lr), loss='categorical_crossentropy', metrics=['accuracy'])
checkpoint=tf.keras.callbacks.ModelCheckpoint(filepath="/content/drive/My Drive/Research/ModelCheckpoint", monitor='val_loss', verbose=0, save_best_only=True,
save_weights_only=False, mode='auto', save_freq='epoch', options=None)
lr_adjust=tf.keras.callbacks.ReduceLROnPlateau( monitor="val_loss", factor=0.5, patience=1, verbose=0, mode="auto",
min_delta=0.00001, cooldown=0, min_lr=0)
callbacks=[checkpoint, lr_adjust]
model.fit(train_batches, steps_per_epoch=spe,
validation_data=validation_batches,validation_steps=vs, epochs=epoch)
# Predict the accuracy on the Test set
acc = model.evaluate_generator(test_batches, steps=len(test_batches), verbose=1)
print("Model Accuracy on Test Data", acc[1]*100)
y = []
for x in range(0,len(test_batches)):
    for i in range(0,len(test_batches[x][1])):
        #print(test_batches[0][1][i])
        y.append(np.argmax(test_batches[x][1][i]))
print(len(y))
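# Predict class probabilities for the test set (a second pass over the generator)
predictions = model.predict(test_batches)
con_mat = tf.math.confusion_matrix(labels=y, predictions=np.argmax(predictions,axis=1)).numpy()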
print(con_mat)
Training/Validation
Epoch 1/6
220/220 [==============================] - 2952s 13s/step - loss: 0.5842 - accuracy: 0.7912 - val_loss: 0.7926 - val_accuracy: 0.7988
Epoch 2/6
220/220 [==============================] - 2736s 12s/step - loss: 0.4041 - accuracy: 0.8723 - val_loss: 0.3094 - val_accuracy: 0.9023
Epoch 3/6
220/220 [==============================] - 2635s 12s/step - loss: 0.3718 - accuracy: 0.8804 - val_loss: 0.3871 - val_accuracy: 0.8906
Epoch 4/6
220/220 [==============================] - 2517s 11s/step - loss: 0.2904 - accuracy: 0.8980 - val_loss: 0.2863 - val_accuracy: 0.9160
Epoch 5/6
220/220 [==============================] - 2364s 11s/step - loss: 0.2779 - accuracy: 0.9057 - val_loss: 0.3500 - val_accuracy: 0.9238
Epoch 6/6
220/220 [==============================] - 2241s 10s/step - loss: 0.2839 - accuracy: 0.9068 - val_loss: 0.2202 - val_accuracy: 0.9355
<tensorflow.python.keras.callbacks.History at 0x7f6f8a59eb70>
Testing
WARNING:tensorflow:From <ipython-input-12-d213edec98d3>:2: Model.evaluate_generator (from tensorflow.python.keras.engine.training) is deprecated and will be removed in a future version.
Instructions for updating:
Please use Model.evaluate, which supports generators.
63/63 [==============================] - 837s 13s/step - loss: 0.1519 - accuracy: 0.9410
Model Accuracy on Test Data 94.0999984741211
Confusion Matrix
[[70 62 57 61]
[82 61 41 66]
[74 69 49 58]
[77 60 48 65]]
I know this is super old, but I just ran into a similar problem and was frustrated not to find an answer here. So here it goes:
Setting shuffle=False for the test_batches generator in ImageDataGenerator().flow_from_directory() should solve the problem.
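A minimal sketch of that change, assuming the same folder layout and class names as in the question and a TensorFlow version that still ships ImageDataGenerator. The helper names make_test_batches and confusion_from_ordered_batches are hypothetical, and TensorFlow is imported inside the first helper only so the second one can be tried without it. With shuffle=False the generator yields files in a fixed order, so its classes attribute (the integer label of each file in that order) should line up with the rows returned by model.predict.

```python
def make_test_batches(test_folder, image_size=(224, 224), batch_size=16):
    """Build the test generator with shuffling turned off (the proposed fix)."""
    # Imported lazily; requires TensorFlow to be installed.
    import tensorflow as tf
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    generator = ImageDataGenerator(
        preprocessing_function=tf.keras.applications.mobilenet.preprocess_input)
    return generator.flow_from_directory(
        test_folder,
        target_size=image_size,
        classes=['CNV', 'DME', 'DRUSEN', 'NORMAL'],
        batch_size=batch_size,
        class_mode="categorical",
        shuffle=False)  # keep the file order fixed across passes


def confusion_from_ordered_batches(model, batches, num_classes=4):
    """Confusion matrix as nested lists (rows: true class, columns: predicted).

    Assumes `batches` was created with shuffle=False, so `batches.classes`
    is in the same order as the rows returned by `model.predict(batches)`.
    """
    probabilities = model.predict(batches)
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for true_label, row in zip(batches.classes, probabilities):
        # Index of the largest probability in this row (argmax).
        predicted = max(range(num_classes), key=lambda j: row[j])
        matrix[int(true_label)][predicted] += 1
    return matrix
```

With the trained model from the question, this would be used as test_batches = make_test_batches(test_folder) followed by con_mat = confusion_from_ordered_batches(model, test_batches).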
It seems that the data generator yields batches in a different order each time it is iterated. Here it is iterated twice: first by your loop extracting the labels, and then by model.predict(test_batches). As a result, the labels and the predictions do not match, because they come from differently ordered batches. The accuracy reported by evaluate is not affected, since each batch delivers its images together with their own labels.
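To make the mechanism concrete, here is a small simulation in plain Python, not the Keras code itself. ReshufflingBatches is a hypothetical stand-in for a generator that reorders its samples on every pass, and the "model" is assumed to be perfect (it returns the true label), so any off-diagonal counts come purely from misalignment. The implied_accuracy helper doubles as a quick sanity check: the diagonal of the matrix in the question sums to 245 out of 1000, i.e. 24.5%, which is chance level for four balanced classes and cannot be reconciled with a 94.1% test accuracy.

```python
import random


class ReshufflingBatches:
    """Hypothetical stand-in for a data generator; reorders on each pass if shuffle=True."""

    def __init__(self, labels, shuffle, seed=0):
        self.labels = list(labels)
        self.shuffle = shuffle
        self.rng = random.Random(seed)

    def one_pass(self):
        order = list(range(len(self.labels)))
        if self.shuffle:
            self.rng.shuffle(order)  # a new order on every pass
        return [self.labels[i] for i in order]


def confusion_matrix(true_labels, predicted_labels, num_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix


def implied_accuracy(matrix):
    """Fraction of samples on the diagonal of a confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def run(shuffle):
    labels = [c for c in range(4) for _ in range(250)]  # 4 balanced classes
    batches = ReshufflingBatches(labels, shuffle=shuffle)
    y_true = batches.one_pass()  # first pass: the loop that extracts labels
    y_pred = batches.one_pass()  # second pass: a perfect model's predictions
    return confusion_matrix(y_true, y_pred, 4)


# Confusion matrix copied from the question.
question_matrix = [[70, 62, 57, 61],
                   [82, 61, 41, 66],
                   [74, 69, 49, 58],
                   [77, 60, 48, 65]]

if __name__ == "__main__":
    print("question matrix:", implied_accuracy(question_matrix))
    print("shuffle=True   :", implied_accuracy(run(shuffle=True)))
    print("shuffle=False  :", implied_accuracy(run(shuffle=False)))
```

With shuffling on, the simulated diagonal should land near 25% even though every prediction is correct, which mirrors the pattern in the question. With shuffling off, both passes see the same order and the matrix is purely diagonal.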