How to extract False Positive, False Negative from a confusion matrix of multiclass classification

I am classifying MNIST data using the following Keras code. From the confusion_matrix command of sklearn.metrics I got the confusion matrix, and from the TruePositive= sum(numpy.diag(cm1)) command I am able to get the True Positives. But I am confused about how to get the True Negatives, False Positives, and False Negatives. I read a solution from here, but the user comments confuse me. Please help with code to get these parameters.

from sklearn.metrics import confusion_matrix
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K
import numpy as np
(x_train, y_train), (x_test, y_test) = mnist.load_data()
batch_size = 128
num_classes = 10
epochs = 1
img_rows, img_cols = 28, 28
y_test1=y_test

if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255

y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
#model.add(GlobalAveragePooling2D())
#model.add(GlobalMaxPooling2D())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))

model.compile(loss=keras.losses.categorical_crossentropy,  # categorical loss to match the 10-class softmax output
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])



model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test))

pre_cls=model.predict_classes(x_test)

cm1 = confusion_matrix(y_test1,pre_cls)
print('Confusion Matrix : \n', cm1)

TruePositive= sum(np.diag(cm1))

First of all, you have omissions in your code; in order to run it, I needed to add the following commands:

import keras
(x_train, y_train), (x_test, y_test) = mnist.load_data()

Having done that, and given the confusion matrix cm1:

array([[ 965,    0,    1,    0,    0,    2,    6,    1,    5,    0],
       [   0, 1113,    4,    2,    0,    0,    3,    0,   13,    0],
       [   8,    0,  963,   14,    5,    1,    7,    8,   21,    5],
       [   0,    0,    3,  978,    0,    7,    0,    6,   12,    4],
       [   1,    0,    4,    0,  922,    0,    9,    3,    3,   40],
       [   4,    1,    1,   27,    0,  824,    6,    1,   20,    8],
       [  11,    3,    1,    1,    5,    6,  925,    0,    6,    0],
       [   2,    6,   17,    8,    2,    0,    1,  961,    2,   29],
       [   5,    1,    2,   13,    4,    6,    2,    6,  929,    6],
       [   6,    5,    0,    7,    5,    6,    1,    6,   10,  963]])

Here is how you can get the requested TP, FP, FN, and TN per class:

The True Positives are simply the diagonal elements:

TruePositive = np.diag(cm1)
TruePositive
# array([ 965, 1113,  963,  978,  922,  824,  925,  961,  929,  963])

The False Positives are the sum of the respective column, minus the diagonal element:

FalsePositive = []
for i in range(num_classes):
    FalsePositive.append(sum(cm1[:,i]) - cm1[i,i])
FalsePositive
# [37, 16, 33, 72, 21, 28, 35, 31, 92, 92]

Similarly, the False Negatives are the sum of the respective row, minus the diagonal element:

FalseNegative = []
for i in range(num_classes):
    FalseNegative.append(sum(cm1[i,:]) - cm1[i,i])
FalseNegative
# [15, 22, 69, 32, 60, 68, 33, 67, 45, 46]

Now, the True Negatives are a little trickier; let's first think about what exactly a True Negative means with respect to, say, class 0: it means all the samples that have been correctly identified as not being 0. So, essentially, what we should do is remove the corresponding row & column from the confusion matrix, and then sum up all the remaining elements:

TrueNegative = []
for i in range(num_classes):
    temp = np.delete(cm1, i, 0)   # delete ith row
    temp = np.delete(temp, i, 1)  # delete ith column
    TrueNegative.append(sum(sum(temp)))
TrueNegative
# [8998, 8871, 9004, 8950, 9057, 9148, 9040, 9008, 8979, 8945]
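For reference, the same four quantities can also be computed without explicit loops. This is just a compact sketch, assuming cm1 is the NumPy array shown above; the results should match the loop-based computations:

TP = np.diag(cm1)                 # diagonal elements
FP = cm1.sum(axis=0) - TP         # column sums minus the diagonal
FN = cm1.sum(axis=1) - TP         # row sums minus the diagonal
TN = cm1.sum() - (TP + FP + FN)   # everything outside the ith row and column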

Let's make a sanity check: for each class, the sum of TP, FP, FN, and TN must be equal to the size of our test set (here 10,000). Let's confirm that this is indeed the case:

l = len(y_test)
for i in range(num_classes):
    print(TruePositive[i] + FalsePositive[i] + FalseNegative[i] + TrueNegative[i] == l)

The result is:

True
True
True
True
True
True
True
True
True
True
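As a side note, newer versions of scikit-learn (0.21 and later) provide multilabel_confusion_matrix, which returns one 2x2 matrix per class, laid out as [[TN, FP], [FN, TP]]. A minimal sketch, assuming y_test1 and pre_cls are the label vectors used above:

from sklearn.metrics import multilabel_confusion_matrix

mcm = multilabel_confusion_matrix(y_test1, pre_cls)  # shape (num_classes, 2, 2)
TN = mcm[:, 0, 0]
FP = mcm[:, 0, 1]
FN = mcm[:, 1, 0]
TP = mcm[:, 1, 1]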

@desertnaut's answer is fantastic. I really like how he checks the answers.

I get slightly different results for TrueNegatives.

He gets:

TrueNegative
# [8998, 8871, 9004, 8950, 9057, 9148, 9040, 9008, 8979, 8945]

I get:

TrueNegative
# [8983, 8849, 8935, 8918, 8997, 9080, 9007, 8941, 8934, 8899]

I assume I am wrong, so I did some further checks.

for i in range(num_classes):
    print(TruePositive[i] + FalsePositive[i] + FalseNegative[i] + TrueNegative[i])

10000
10000
10000
10000
10000
10000
10000
10000
10000
10000

The first and last results are easiest to double-check:

>>> sum(sum(cm1[1:,1:]))
8983

>>> sum(sum(cm1[:9,:9]))
8899

I did wonder if a deep copy was required before removing columns and rows, but, as you may know, numpy.delete returns a new array.
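For example, deleting a row from a small toy array returns a fresh array and leaves the input untouched (a standalone check, not using cm1):

a = np.arange(9).reshape(3, 3)
b = np.delete(a, 0, 0)     # drop the first row -> new (2, 3) array
print(a.shape, b.shape)    # (3, 3) (2, 3) -- the original `a` is unchanged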

I started from scratch and reproduced my results for the TrueNegatives. I think the TrueNegatives from the original post need to be investigated.
