confusion_matrix() | ValueError: Classification metrics can't handle a mix of multiclass and multiclass-multioutput targets

Question

This has definitely been asked before, but I have not been able to apply the solutions from other posts to my own instance of the problem.

I have many classification models that I want to compare using confusion_matrix():

matrix = confusion_matrix(y_test, y_pred) # ERROR

>>> y_pred
[[2 2 2 ... 2 2 2]
 [2 2 2 ... 2 2 2]
 [2 2 2 ... 2 2 2]
 ...
 [3 3 2 ... 3 2 3]
 [2 2 2 ... 2 2 2]
 [3 3 3 ... 3 3 3]]

>>> y_pred.shape
(500, 256)

>>> y_test
[1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3
 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3]

>>> y_test.shape
(500,)

Error:

ValueError: Classification metrics can't handle a mix of multiclass and multiclass-multioutput targets

When .flatten() is applied to y_pred, giving a 1D array (500 * 256 = 128000):

ValueError: Found input variables with inconsistent numbers of samples: [500, 128000]

Answer 1

A confusion matrix is built by comparing each predicted value with the corresponding actual value. It is impossible to compare 1 with [2,2,2....2,2,2].

In your case, y_pred is 2D but y_test is 1D, and that is where the error comes from. I believe you have to pick the most common value in each row of your predictions, for example 2 from [2,2,2....2,2].
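To see why scikit-learn raises this particular message, it can help to ask it how it classifies each array. The sketch below uses sklearn.utils.multiclass.type_of_target on small toy arrays; the arrays are made-up stand-ins for the (500,) y_test and (500, 256) y_pred in the question, not the real data.

```python
import numpy as np
from sklearn.utils.multiclass import type_of_target

# Toy stand-ins (assumed values): one true label per sample ...
y_test = np.array([1, 2, 3, 1])
# ... but several predicted labels per sample (2D, like the question's y_pred)
y_pred = np.array([[1, 1, 2],
                   [2, 2, 3],
                   [3, 3, 3],
                   [1, 2, 1]])

kind_true = type_of_target(y_test)
kind_pred = type_of_target(y_pred)
print(kind_true)  # multiclass
print(kind_pred)  # multiclass-multioutput
```

confusion_matrix() needs both arguments to be of the same kind, so the 2D predictions have to be reduced to one label per sample first.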

So here is the solution:

from scipy import stats
import numpy as np
from sklearn.metrics import confusion_matrix

# Take the most frequent element of each row of the predictions.
# np.ravel(...)[0] handles both SciPy versions that return a scalar mode
# and older ones that return a length-1 array.
y_pred_list = [int(np.ravel(stats.mode(arr)[0])[0]) for arr in y_pred.tolist()]

y_pred_array = np.array(y_pred_list)  # 1D array with the same shape as y_test

print(y_pred_array.shape)

print(y_pred_array)

matrix = confusion_matrix(y_test, y_pred_array)
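If you would rather not depend on SciPy, the same per-row majority vote can be sketched with NumPy alone. This assumes the labels are non-negative integers and that a majority vote is really the reduction you want (as suggested above). row_majority is a hypothetical helper name, and the small y_pred here is toy data, not the array from the question.

```python
import numpy as np

def row_majority(pred_2d):
    """Return the most frequent non-negative integer label in each row."""
    pred_2d = np.asarray(pred_2d)
    n_labels = pred_2d.max() + 1
    # counts[i, k] = number of times label k occurs in row i
    counts = np.stack([np.bincount(row, minlength=n_labels) for row in pred_2d])
    # On ties, argmax returns the smallest label
    return counts.argmax(axis=1)

# Toy stand-in for the (500, 256) y_pred in the question
y_pred = np.array([[2, 2, 3, 2],
                   [3, 3, 2, 3],
                   [1, 1, 1, 2]])
y_pred_1d = row_majority(y_pred)
print(y_pred_1d)        # [2 3 1]
print(y_pred_1d.shape)  # (3,)
```

The result has one label per sample, so it can be passed to confusion_matrix(y_test, y_pred_1d) as long as y_test has the same length.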

Answer 1 (2, accepted, 2021-03-31 00:14:35)