I have a csv file with true and predicted labels (4 classes) associated with an ID. The csv file looks like this:
task_id,labels_true,labels_pred
76017-126511-18,2,2
76017-126512-18,0,3
76017-126513-18,2,2
76018-126511-18,2,2
76018-126512-18,2,2
76018-126513-18,2,1
76019-126511-18,2,2
76019-126512-18,1,0
I am using confusion_matrix from sklearn.metrics:
from sklearn.metrics import confusion_matrix
# df is the DataFrame loaded from the csv file above
y_true = df["labels_true"]
y_pred = df["labels_pred"]
cnf_matrix = confusion_matrix(y_true, y_pred, labels=[0,1,2,3])
It returns an array as follows:
[[ 554 1 28 0]
[ 15 1375 43 0]
[ 42 476 2263 0]
[ 0 0 0 0]]
My aim is to return a list with each element ID associated with the respective tp, tn, fp, fn values like this:
task_id,labels_true,labels_pred, cm
76017-126511-18,2,2, tp
76017-126513-18,2,2, tp
76018-126511-18,2,2, tp
This is a multi-class confusion matrix. True/false positives and negatives are defined for binary classification problems. What you can do is encode your labels as binary values (for example, classes 1, 2, 3 encoded as 1) and recalculate the confusion matrix.
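A minimal sketch of that binary encoding, in plain Python so it runs without pandas. The rows are copied from the question; the helper names `binarize` and `outcome` are made up for illustration, and treating class 0 as the negative class is an assumption, not something the question specifies.

```python
# Sample rows from the question: (task_id, labels_true, labels_pred)
rows = [
    ("76017-126511-18", 2, 2),
    ("76017-126512-18", 0, 3),
    ("76018-126513-18", 2, 1),
    ("76019-126512-18", 1, 0),
]

def binarize(label):
    """Collapse classes 1, 2, 3 into the positive class 1; class 0 stays 0 (assumed encoding)."""
    return 1 if label in (1, 2, 3) else 0

def outcome(y_true, y_pred):
    """Name the binary confusion-matrix cell for one (true, predicted) pair."""
    if y_true == 1:
        return "tp" if y_pred == 1 else "fn"
    return "fp" if y_pred == 1 else "tn"

# Attach the binary outcome to every row
labelled = [
    (task_id, t, p, outcome(binarize(t), binarize(p)))
    for task_id, t, p in rows
]

for row in labelled:
    print(",".join(str(v) for v in row))
```

Note that under this encoding the row with true label 2 and predicted label 1 comes out as tp, because both labels collapse to 1: the binary view can no longer see confusions among classes 1, 2 and 3.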
TL;DR: For multi-class cases, this is not possible.
As already suggested, the very notions of True Positives (TP), True Negatives (TN), False Positives (FP), and False Negatives (FN) come from binary classification settings. They can indeed be used in multi-class classification, as shown here, but in such cases they are not a straightforward extension of the binary case, which makes what you ask here impossible.
In multi-class classification, all these notions are defined and calculated per class. This makes it impossible to assign a sample to one and only one of these categories (TP, FP, TN, FN).
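To illustrate what "per class" means, here is a rough sketch that derives the four counts for each class from the confusion matrix posted in the question. It assumes scikit-learn's convention that rows are true labels and columns are predicted labels; the variable names are illustrative only.

```python
# Confusion matrix from the question (rows: true label, columns: predicted label)
cm = [
    [554, 1, 28, 0],
    [15, 1375, 43, 0],
    [42, 476, 2263, 0],
    [0, 0, 0, 0],
]
labels = [0, 1, 2, 3]

total = sum(sum(row) for row in cm)

per_class = {}
for k, label in enumerate(labels):
    tp = cm[k][k]                        # true k, predicted k
    fn = sum(cm[k]) - tp                 # true k, predicted something else
    fp = sum(row[k] for row in cm) - tp  # predicted k, true something else
    tn = total - tp - fn - fp            # neither true nor predicted k
    per_class[label] = {"tp": tp, "fp": fp, "fn": fn, "tn": tn}

for label, counts in per_class.items():
    print(label, counts)
```

Each class gets its own set of four numbers, so a single sample contributes to four different tallies at once. scikit-learn's `multilabel_confusion_matrix` reports the same per-class counts as a stack of 2x2 matrices, if you prefer not to compute them by hand.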
Let's demonstrate this with some examples, using your case (4 classes: [0, 1, 2, 3]).
Take a misclassified sample first, e.g.:
True label: 0
Predicted label: 3
For class 0, this is a False Negative (FN): the prediction is not 0, as it should be.
For class 1, this is a True Negative (TN): it is not 1, and it has correctly been classified as not 1.
For class 2, this is again a True Negative (TN): it is not 2, and it has correctly been classified as not 2.
For class 3, this is a False Positive (FP): it has been wrongly classified as 3 without being so.
The same holds for a correct classification, say:
True label: 2
Predicted label: 2
For class 0, this is a True Negative (TN): it is not 0, and it has correctly been classified as not 0.
For class 1, this is a True Negative (TN): it is not 1, and it has correctly been classified as not 1.
For class 2, this is a True Positive (TP).
For class 3, this is a True Negative (TN): it is not 3, and it has correctly been classified as not 3.
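The two walk-throughs above can be reproduced with a small one-vs-rest helper. This is only a sketch; `one_vs_rest` is a hypothetical name, not a scikit-learn function.

```python
def one_vs_rest(y_true, y_pred, labels):
    """Return the TP/FP/FN/TN outcome of a single sample from the viewpoint of each class."""
    result = {}
    for c in labels:
        is_c = (y_true == c)     # sample truly belongs to class c
        pred_c = (y_pred == c)   # sample was predicted as class c
        if is_c and pred_c:
            result[c] = "tp"
        elif is_c:
            result[c] = "fn"
        elif pred_c:
            result[c] = "fp"
        else:
            result[c] = "tn"
    return result

labels = [0, 1, 2, 3]
misclassified = one_vs_rest(0, 3, labels)  # true 0, predicted 3
correct = one_vs_rest(2, 2, labels)        # true 2, predicted 2

print(misclassified)
print(correct)
```

Every sample comes back with four outcomes, one per class, which is why a single tp/tn/fp/fn column cannot be filled in without first choosing which class counts as "positive".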
Given this exposition, it should hopefully be clear that what you ask is actually not possible in the multi-class case.