TP, TN, FP and FN for arrays in Python

Question

My prediction results look like this:

TestArray

[1,0,0,0,1,0,1,...,1,0,1,1],
[1,0,1,0,0,1,0,...,0,1,1,1],
[0,1,1,1,1,1,0,...,0,1,1,1],
.
.
.
[1,1,0,1,1,0,1,...,0,1,1,1],

PredictionArray

[1,0,0,0,0,1,1,...,1,0,1,1],
[1,0,1,1,1,1,0,...,1,0,0,1],
[0,1,0,1,0,0,0,...,1,1,1,1],
.
.
.
[1,1,0,1,1,0,1,...,0,1,1,1],

This is the shape of the arrays that I have:

TestArray.shape

Out[159]: (200, 24)

PredictionArray.shape

Out[159]: (200, 24)

I want to get TP, TN, FP and FN for these arrays.

I tried this code:

cm=confusion_matrix(TestArray.argmax(axis=1), PredictionArray.argmax(axis=1))
TN = cm[0][0]
FN = cm[1][0]
TP = cm[1][1]
FP = cm[0][1]
print(TN,FN,TP,FP)

but these are the results I got:

TN = cm[0][0]
FN = cm[1][0]
TP = cm[1][1]
FP = cm[0][1]
print(TN,FN,TP,FP)

125 5 0 1

I checked the shape of cm:

cm.shape

Out[168]: (17, 17)

125 + 5 + 0 + 1 = 131, and that does not equal the number of rows I have, which is 200.

I am expecting 200, since each cell in the array is supposed to be FN, TN, FP or TP, so the total should be 200.

How can I fix that?

Here is an example of the problem:

import numpy as np
from sklearn.metrics import confusion_matrix


TestArray = np.array(
[
[1,0,0,1,0,1,1,0,1,0,1,1,0,0,1,1,1,0,0,1],
[0,1,1,0,1,0,0,1,0,0,0,1,0,1,0,1,1,0,1,1],
[1,0,1,1,1,1,0,0,1,1,1,1,0,0,1,0,0,0,0,0],
[0,1,1,1,0,0,0,0,0,1,0,0,1,0,0,1,0,1,1,1],
[0,0,0,0,1,1,0,1,1,0,0,1,0,1,1,0,1,1,1,1],
[1,0,0,1,1,1,0,1,1,0,1,0,0,1,1,0,0,1,0,0],
[1,1,1,0,0,1,0,0,1,1,0,1,0,1,1,1,1,1,0,1],
[0,0,0,1,0,0,1,0,1,0,1,0,0,0,0,1,0,0,1,1],
[1,0,1,0,0,0,0,1,0,1,0,1,0,0,0,0,1,0,1,0],
[1,1,0,1,1,1,1,0,1,0,1,0,1,1,1,1,0,1,0,0]
])

TestArray.shape



PredictionArray = np.array(
[
[0,0,0,1,1,1,1,0,0,0,1,0,0,0,1,0,1,0,1,1],
[0,1,0,0,1,0,1,1,0,0,0,1,1,0,0,1,1,0,0,1],
[1,1,0,1,1,1,0,0,0,0,0,1,0,0,1,0,0,1,0,0],
[0,1,0,1,0,0,1,0,0,1,0,1,1,0,0,1,0,0,1,1],
[0,0,1,0,0,1,0,1,1,1,0,1,1,1,0,0,1,1,0,1],
[1,0,0,1,0,1,1,1,1,0,0,1,0,1,1,1,0,1,1,0],
[1,1,0,0,1,1,0,0,0,1,0,1,0,0,1,1,0,1,0,1],
[0,0,0,0,0,0,0,1,1,0,1,0,0,1,0,1,1,0,1,1],
[1,0,1,1,0,0,0,1,0,1,0,1,1,1,1,0,0,0,1,0],
[1,1,0,1,1,1,1,1,1,0,1,0,0,0,0,1,1,1,0,0]
])

PredictionArray.shape

cm=confusion_matrix(TestArray.argmax(axis=1), PredictionArray.argmax(axis=1))
TN = cm[0][0]
FN = cm[1][0]
TP = cm[1][1]
FP = cm[0][1]

print(TN,FN,TP,FP)

The output is:

5 0 2 0

5 + 0 + 2 + 0 = 7!!

There are 20 columns and 10 rows in the array,

but cm gives a total of 7!!

Answer 1

When using np.argmax, the arrays you pass to sklearn.metrics.confusion_matrix are no longer binary, because np.argmax returns the index of the first occurrence of the maximum value, in this case along axis=1.

You don't get the good ol' true positives / hits, true negatives / correct rejections, etc., when your prediction isn't binary.

For the original (200, 24) arrays, you should find that sum(sum(cm)) does equal 200: one entry per row.
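To see why the argmax step changes the problem, here is a minimal sketch on a small made-up array (`demo` and its values are illustrative, not taken from the question). Each row collapses to a single column index, so the labels handed to confusion_matrix are column positions rather than 0 and 1:

```python
import numpy as np

# Small illustrative array (hypothetical values, not the question's data).
demo = np.array([
    [0, 0, 1, 1],
    [1, 0, 0, 1],
    [0, 1, 0, 0],
])

# argmax along axis=1 returns, for each row, the index of the first maximum,
# i.e. the position of the first 1 in that row.
labels = demo.argmax(axis=1)

print(labels)        # [2 0 1]
print(labels.shape)  # (3,): one label per row, not one per cell
```

With many distinct first-1 positions, confusion_matrix treats each position as its own class, which would explain a cm larger than 2 x 2.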

If each element of the arrays represents an individual prediction, i.e. you are trying to get TP/TN/FP/FN for a total of 200 (10 * 20) predictions, each with an outcome of either 0 or 1, then you can obtain TP/TN/FP/FN by flattening the arrays before passing them to confusion_matrix. That is to say, you could reshape TestArray and PredictionArray to (200,), e.g.:

cm = confusion_matrix(TestArray.reshape(-1), PredictionArray.reshape(-1))

TN = cm[0][0]
FN = cm[1][0]
TP = cm[1][1]
FP = cm[0][1]

print(TN, FN, TP, FP, '=', TN + FN + TP + FP)

Which returns:

74 28 73 25 = 200
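As a cross-check that does not depend on how cm is indexed, the four counts can also be computed directly with NumPy boolean masks. This is only a sketch on small made-up arrays: `y_true` and `y_pred` are hypothetical names standing in for TestArray and PredictionArray, and it assumes both contain only 0 and 1 and have the same shape.

```python
import numpy as np

# Hypothetical stand-ins for TestArray and PredictionArray (binary, same shape).
y_true = np.array([[1, 0, 1, 0],
                   [0, 1, 1, 0]])
y_pred = np.array([[1, 1, 0, 0],
                   [0, 1, 1, 1]])

# Compare cell by cell; every cell falls into exactly one of the four groups.
TP = int(np.sum((y_true == 1) & (y_pred == 1)))
TN = int(np.sum((y_true == 0) & (y_pred == 0)))
FP = int(np.sum((y_true == 0) & (y_pred == 1)))
FN = int(np.sum((y_true == 1) & (y_pred == 0)))

print(TN, FN, TP, FP, '=', TN + FN + TP + FP)  # 2 1 3 2 = 8
```

Because the masks are mutually exclusive and cover every cell, the four counts always add up to the number of elements in the arrays. With scikit-learn, for binary labels, confusion_matrix(...).ravel() on the flattened arrays gives the same counts in the order TN, FP, FN, TP.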

Answer 1
2 accepted 2020-04-01 07:29:38
