
Error: Classification metrics can't handle a mix of multiclass-multioutput and multilabel-indicator targets

I am a newbie to machine learning in general.

I am trying to do multilabel text classification. I have the original labels for these documents as well as the result of the classification (from an MLkNN classifier), represented as one-hot encoding (19000 documents x 200 labels). Now I am trying to evaluate the classification with micro- and macro-averaged f1_score, but the first f1_score call raises ValueError: Classification metrics can't handle a mix of multiclass-multioutput and multilabel-indicator targets, and I don't know how to solve it. This is my code:

import numpy as np
from sklearn.metrics import f1_score

y_true = np.loadtxt("target_matrix.txt")
y_pred = np.loadtxt("classification_results.txt")

print(f1_score(y_true, y_pred, average='macro'))  # the ValueError is raised here
print(f1_score(y_true, y_pred, average='micro'))
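The two target types named in the error message can be inspected directly. The following is a minimal sketch on hypothetical toy matrices (not the asker's files), assuming the cause is a label matrix that contains values other than 0 and 1. It uses scikit-learn's `type_of_target` to show how one stray value changes the detected type.

```python
import numpy as np
from sklearn.metrics import f1_score
from sklearn.utils.multiclass import type_of_target

# Hypothetical toy data standing in for the two loaded matrices.
y_pred = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 1, 0]])  # strictly 0/1
y_bad = np.array([[1, 0, 2],
                  [0, 1, 0],
                  [1, 1, 0]])   # contains a stray 2

pred_type = type_of_target(y_pred)
bad_type = type_of_target(y_bad)
print(pred_type, "vs", bad_type)

# Mixing the two types is what f1_score refuses to handle.
raised = False
try:
    f1_score(y_bad, y_pred, average="macro")
except ValueError as exc:
    raised = True
    print(exc)

# Listing the distinct values is a quick way to spot non-binary entries.
print(np.unique(y_bad))
```

Running the same two checks (`type_of_target` and `np.unique`) on the real `y_true` and `y_pred` should show which of the two matrices is not a pure 0/1 indicator matrix.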

I also tried to use cross_val_score for the classification to get the evaluation right away, but ran into another error (raised from the cross_val_score line):

File "_csparsetools.pyx", line 20, in scipy.sparse._csparsetools.lil_get1
File "_csparsetools.pyx", line 48, in scipy.sparse._csparsetools.lil_get1
IndexError: column index (11) out of bounds

This is my code:

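import numpy as np
from sklearn.model_selection import cross_val_score
from skmultilearn.adapt import MLkNN  # assuming MLkNN comes from scikit-multilearn

X = np.loadtxt("docvecs.txt", delimiter=",")
y = np.loadtxt("target_matrix.txt", dtype='int')

cv_scores = []
mlknn = MLkNN(k=10)
scores = cross_val_score(mlknn, X, y, cv=5, scoring='f1_micro')
cv_scores.append(scores)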

Any help with either of the errors is much appreciated, thanks.
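For reference, here is a minimal sketch of the same cross-validation call on a well-formed label matrix. It makes two assumptions that are not in the question: the data is synthetic (from `make_multilabel_classification`), and scikit-learn's `KNeighborsClassifier` is swapped in as a stand-in for MLkNN. It only illustrates the shape and type of `y` that `cross_val_score` accepts with `scoring='f1_micro'`.

```python
from sklearn.datasets import make_multilabel_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.utils.multiclass import type_of_target

# Synthetic stand-in for docvecs.txt / target_matrix.txt (assumption).
X, Y = make_multilabel_classification(
    n_samples=60, n_features=8, n_classes=4, random_state=0
)
y_type = type_of_target(Y)
print(Y.shape, y_type)  # one row per sample, one 0/1 column per label

# KNeighborsClassifier is a stand-in here, not the asker's MLkNN.
clf = KNeighborsClassifier(n_neighbors=3)
scores = cross_val_score(clf, X, Y, cv=5, scoring="f1_micro")
print(scores)
```

If the real `y` does not report `multilabel-indicator` here, the label matrix itself is the first thing to check before suspecting the classifier.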

Can you show the first couple of elements of y? Are you using scikit-multilearn? Also, if you can, please use the 0.1.0 release candidate of scikit-multilearn: the second error is most likely a bug that was fixed in master, and a new version is planned for release in a couple of days.

You can get master via pip:

pip uninstall -y scikit-multilearn
pip install https://github.com/scikit-multilearn/scikit-multilearn/archive/master.zip

I was creating the y array manually, and it seems that was my mistake. I now use MultiLabelBinarizer to create it, as in the following example, and now it works:

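import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import MultiLabelBinarizer
from skmultilearn.adapt import MLkNN

# Toy label lists; the real list needs one entry per row of X.
train_foo = [['sci-fi', 'thriller'], ['comedy'], ['sci-fi', 'thriller'], ['comedy']]
mlb = MultiLabelBinarizer()
mlb_label_train = mlb.fit_transform(train_foo)  # binary indicator matrix

X = np.loadtxt("docvecs.txt", delimiter=",")
cv_scores = []
mlknn = MLkNN(k=3)
scores = cross_val_score(mlknn, X, mlb_label_train, cv=5, scoring='f1_macro')
cv_scores.append(scores)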

MultiLabelBinarizer is documented in the scikit-learn docs under sklearn.preprocessing.
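To make the fix concrete, the sketch below binarizes the label lists from the answer and scores them against a hypothetical set of predicted label lists (`raw_pred` is invented for illustration). Because both matrices come from the same fitted MultiLabelBinarizer, they share the same columns and are both 0/1 indicator matrices, which is what f1_score needs.

```python
from sklearn.metrics import f1_score
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.utils.multiclass import type_of_target

raw_true = [['sci-fi', 'thriller'], ['comedy'], ['sci-fi', 'thriller'], ['comedy']]
# Hypothetical predictions, made up for this example.
raw_pred = [['sci-fi'], ['comedy'], ['sci-fi', 'thriller'], ['thriller']]

mlb = MultiLabelBinarizer()
y_true = mlb.fit_transform(raw_true)  # learns the label set, returns a 0/1 matrix
y_pred = mlb.transform(raw_pred)      # reuses the same column order

print(list(mlb.classes_))             # column order of the indicator matrix
print(y_true)
print(type_of_target(y_true), type_of_target(y_pred))

micro = f1_score(y_true, y_pred, average="micro")
macro = f1_score(y_true, y_pred, average="macro")
print(micro, macro)
```

Calling `transform` rather than `fit_transform` on the predictions is the design choice that matters: fitting a second binarizer could produce a different column order or a different number of columns.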


