Plotting a confusion matrix for an image classification model

Question

I built an image classification CNN with Keras. While the model itself works fine (it predicts properly on new data), I am having problems plotting the confusion matrix and classification report for the model.

I trained the model using ImageDataGenerator:

train_path = '../DATASET/TRAIN'
test_path = '../DATASET/TEST'
IMG_BREDTH = 30
IMG_HEIGHT = 60
num_classes = 2

train_batch = ImageDataGenerator(featurewise_center=False,
                                 samplewise_center=False, 
                                 featurewise_std_normalization=False, 
                                 samplewise_std_normalization=False, 
                                 zca_whitening=False, 
                                 rotation_range=45, 
                                 width_shift_range=0.2, 
                                 height_shift_range=0.2, 
                                 horizontal_flip=True, 
                                 vertical_flip=False).flow_from_directory(train_path, 
                                                                          target_size=(IMG_HEIGHT, IMG_BREDTH), 
                                                                          classes=['O', 'R'], 
                                                                          batch_size=100)

test_batch = ImageDataGenerator().flow_from_directory(test_path, 
                                                      target_size=(IMG_HEIGHT, IMG_BREDTH), 
                                                      classes=['O', 'R'], 
                                                      batch_size=100)

This is the code for the confusion matrix and classification report:

batch_size = 100
target_names = ['O', 'R']
Y_pred = model.predict_generator(test_batch, 2513 // batch_size+1)
y_pred = np.argmax(Y_pred, axis=1)
print('Confusion Matrix')
cm = metrics.confusion_matrix(test_batch.classes, y_pred)
print(cm)
print('Classification Report')
print(metrics.classification_report(test_batch.classes, y_pred))

For the confusion matrix I get the following result (which seems to be wrong):

Confusion Matrix
[[1401    0]
 [1112    0]]

The false positives and true positives are 0. For the classification report I get the following output and warning:

Classification Report
             precision    recall  f1-score   support

          0       0.56      1.00      0.72      1401
          1       0.00      0.00      0.00      1112

avg / total       0.31      0.56      0.40      2513

/Users/sashaanksekar/anaconda3/lib/python3.6/site-packages/sklearn/metrics/classification.py:1135: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples.
  'precision', 'predicted', average, warn_for)
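
For reference, the figures in the report follow directly from the matrix above. The snippet below is a minimal sketch (plain NumPy, not scikit-learn's own implementation) that recomputes per-class precision, recall and F1 and the support-weighted averages, assuming the scikit-learn convention that rows are true labels and columns are predicted labels. It shows that every sample was assigned to class 0, which is why class 1 has no predicted samples and triggers the warning.

```python
import numpy as np

# Confusion matrix from the output above: rows = true labels, columns = predicted labels
cm = np.array([[1401, 0],
               [1112, 0]])

support = cm.sum(axis=1)      # number of true samples per class
predicted = cm.sum(axis=0)    # number of predicted samples per class
tp = np.diag(cm)              # correctly classified samples per class

# Precision is undefined when a class is never predicted; use 0.0 in that case
precision = np.divide(tp, predicted, out=np.zeros(2), where=predicted > 0)
recall = tp / support
denom = precision + recall
f1 = np.divide(2 * precision * recall, denom, out=np.zeros(2), where=denom > 0)

def weighted(metric):
    """Average a per-class metric, weighting each class by its support."""
    return float(np.average(metric, weights=support))

print(np.round(precision, 2), np.round(recall, 2), np.round(f1, 2))
print(round(weighted(precision), 2), round(weighted(recall), 2), round(weighted(f1), 2))
```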

I am trying to predict whether an object is organic or recyclable. I have around 22000 training images and 2513 test images.

I am new to machine learning. What am I doing wrong?

Thanks in advance.

Answer 1

To plot the confusion matrix do the following:

import matplotlib.pyplot as plt
import numpy as np
from sklearn import metrics  # provides confusion_matrix, as used in the question

cm = metrics.confusion_matrix(test_batch.classes, y_pred)
# or
#cm = np.array([[1401,    0],[1112, 0]])

plt.imshow(cm, cmap=plt.cm.Blues)
plt.xlabel("Predicted labels")
plt.ylabel("True labels")
plt.xticks([], [])
plt.yticks([], [])
plt.title('Confusion matrix ')
plt.colorbar()
plt.show()
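
The plot above hides the tick labels, so the cells are hard to read. As a possible variation, the sketch below labels the axes with the class names and writes each count into its cell. The helper name plot_confusion_matrix_annotated is made up for this illustration, the class names 'O' and 'R' are taken from the question, and the Agg backend is selected only so that the sketch also runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line to open a window with plt.show()
import matplotlib.pyplot as plt
import numpy as np

def plot_confusion_matrix_annotated(cm, class_names):
    """Draw cm as a heatmap with class names on the axes and counts in the cells."""
    fig, ax = plt.subplots()
    im = ax.imshow(cm, cmap=plt.cm.Blues)
    fig.colorbar(im, ax=ax)
    ax.set_xticks(range(len(class_names)))
    ax.set_yticks(range(len(class_names)))
    ax.set_xticklabels(class_names)
    ax.set_yticklabels(class_names)
    ax.set_xlabel("Predicted labels")
    ax.set_ylabel("True labels")
    ax.set_title("Confusion matrix")
    # Use white text on dark cells and black text on light cells
    threshold = cm.max() / 2
    for i in range(cm.shape[0]):
        for j in range(cm.shape[1]):
            ax.text(j, i, str(cm[i, j]), ha="center", va="center",
                    color="white" if cm[i, j] > threshold else "black")
    return fig, ax

cm = np.array([[1401, 0], [1112, 0]])
fig, ax = plot_confusion_matrix_annotated(cm, ["O", "R"])
# fig.savefig("confusion_matrix.png") or plt.show() to view the result
```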

References:

https://www.dataschool.io/simple-guide-to-confusion-matrix-terminology/

https://machinelearningmastery.com/confusion-matrix-machine-learning/

Answer 2

If someone got here like me because of a similar issue, there are several things that could help:

Make sure you set shuffle = False in your test set generator;
It's better to set the batch_size to a divisor of your image count. If not, make sure the generator doesn't skip any images;
Try training without augmentation first;
There seems to be an issue where the output of predict_generator is not consistent; try setting workers = 0 if possible, like this:
predictions = model.predict_generator(testGenerator, steps = np.ceil(testGenerator.samples / testGenerator.batch_size), verbose=1, workers=0)

In my case, without this the predictions changed each time I called predict_generator.
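
The first two points in the list can be illustrated without Keras. The sketch below uses only made-up data: it computes the number of prediction steps with a ceiling division (the helper prediction_steps is hypothetical), and it simulates why a shuffling generator breaks the comparison against generator.classes, which lists labels in directory order rather than in the order the batches were yielded.

```python
import math
import numpy as np

def prediction_steps(num_samples, batch_size):
    """Number of batches needed to cover every sample exactly once."""
    return math.ceil(num_samples / batch_size)

# For the 2513 test images the question's formula happens to agree with the ceiling
print(prediction_steps(2513, 100), 2513 // 100 + 1)
# When the count is divisible by the batch size, "// batch_size + 1" asks for one batch too many
print(prediction_steps(2500, 100), 2500 // 100 + 1)

# Simulated labels in directory order, as a generator's .classes attribute would list them
labels_in_directory_order = np.array([0] * 5 + [1] * 5)

# A shuffling generator yields the images in some other order
rng = np.random.default_rng(0)
yield_order = rng.permutation(len(labels_in_directory_order))

# Even a perfect model then returns predictions in the shuffled order
perfect_predictions = labels_in_directory_order[yield_order]
naive_accuracy = (perfect_predictions == labels_in_directory_order).mean()
print("accuracy when comparing against directory order:", naive_accuracy)

# Only after undoing the permutation do predictions and labels line up again
restored = perfect_predictions[np.argsort(yield_order)]
```

With shuffle = False the yield order equals the directory order, so no such realignment is needed.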

When you have only two classes and the model ends in a single sigmoid output unit (so predictions has shape (N, 1)), you have to use:
predictedClasses = np.where(predictions>0.5, 1, 0) instead of np.argmax(Y_pred, axis=1), since in this case np.argmax will always output 0 (there is only one column to pick from).
np.where(predictions>0.5, 1, 0) returns 1 where the prediction is greater than 0.5 and 0 otherwise.
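
A small NumPy sketch of the difference, using made-up probabilities rather than real model output. It assumes a single sigmoid output of shape (N, 1) for the first case, and contrasts it with a two-column softmax output, where np.argmax is the appropriate choice.

```python
import numpy as np

# Hypothetical output of a model with ONE sigmoid unit: shape (N, 1)
sigmoid_predictions = np.array([[0.10], [0.85], [0.40], [0.95]])

# argmax over a single column can only ever return index 0
argmax_classes = np.argmax(sigmoid_predictions, axis=1)

# Thresholding at 0.5 recovers the intended class labels
threshold_classes = np.where(sigmoid_predictions > 0.5, 1, 0).ravel()

# Hypothetical output of a model with TWO softmax units: shape (N, 2)
softmax_predictions = np.array([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.05, 0.95]])
softmax_classes = np.argmax(softmax_predictions, axis=1)

print(argmax_classes)     # always class 0
print(threshold_classes)  # classes from the sigmoid output
print(softmax_classes)    # classes from the softmax output
```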

Answer 3

I use sklearn's plot_confusion_matrix.

To use it I made a hack so that sklearn does not complain when it asks the estimator for predictions, even though the estimator is a Keras model. So, if model is a trained Keras model:

X,y = test_generator.next()
y = np.argmax(y, axis=1)

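import numpy as np
from sklearn.metrics import plot_confusion_matrix  # deprecated in scikit-learn 1.0, removed in 1.2
from sklearn.neural_network import MLPClassifier

# Thin wrapper so that sklearn treats the Keras model as a classifier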
class newmodel(MLPClassifier):
    def __init__(self, model):
        self.model = model
    def predict(self, X):
        y = self.model.predict(X)
        return np.argmax(y,axis=1)

model1 = newmodel(model)
plot_confusion_matrix(model1, X, y , normalize='true', xticks_rotation = 'vertical', display_labels = list(train_generator.class_indices.keys()))

It works for me.

3 answers

Answer 1
4 2018-07-11 10:04:41

Answer 2
3 Accepted 2019-05-05 01:59:06

Answer 3
1 2021-06-07 10:26:13