ValueError: Classification metrics can't handle a mix of multiclass and multilabel-indicator targets
I have a multi-class text classification problem with 2000 different labels. I am doing classification using an LSTM with GloVe embeddings.
import numpy as np
from sklearn.preprocessing import LabelEncoder
from keras import backend as K
from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense

# Encode the string labels as integers
le = LabelEncoder()
le.fit(y)
train_y = le.transform(y_train)
test_y = le.transform(y_test)

np.random.seed(seed)
K.clear_session()

model = Sequential()
model.add(Embedding(max_features, embed_dim, input_length=X_train.shape[1],
                    weights=[embedding_matrix]))  # , trainable=False
model.add(LSTM(100, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(num_classes, activation='softmax'))
model.compile(optimizer='rmsprop', loss='sparse_categorical_crossentropy')
print(model.summary())
My error metric is the F1 score, and I built a function for it, passed as the `metrics` callback to `fit`:
model.fit(X_train, train_y, validation_data=(X_test, test_y),epochs=10, batch_size=64, callbacks=[metrics])
I get the below error after the first epoch:
ValueError: Classification metrics can't handle a mix of multiclass and continuous-multioutput targets
Can you please tell me where I made a mistake in my code?
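(For context, the error above can be reproduced in isolation. The following is a minimal sketch with toy data, not the asker's actual arrays: integer class labels on one side, continuous softmax-style outputs on the other.)

```python
import numpy as np
from sklearn.metrics import f1_score

y_true = np.array([0, 2, 1, 0])          # integer class labels: "multiclass"
y_prob = np.array([[0.7, 0.2, 0.1],      # raw softmax outputs, one row per
                   [0.1, 0.3, 0.6],      # sample: "continuous-multioutput"
                   [0.2, 0.6, 0.2],
                   [0.8, 0.1, 0.1]])

try:
    f1_score(y_true, y_prob)
except ValueError as e:
    print(e)  # Classification metrics can't handle a mix of multiclass
              # and continuous-multioutput targets
```

scikit-learn inspects the types of both arguments before computing anything, which is why the mismatch surfaces as this ValueError rather than a wrong score.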
F1 score, recall, and precision are metrics for binary classification; to use them in a multiclass/multilabel problem you need to add the `average` parameter to your `f1_score`, `recall_score`, and `precision_score` calls.
Try this:
_val_f1 = f1_score(val_targ, val_predict, average='weighted')
_val_recall = recall_score(val_targ, val_predict, average='weighted')
_val_precision = precision_score(val_targ, val_predict, average='weighted')
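(To illustrate what `average` does, here is a small self-contained sketch on made-up labels: `'weighted'` averages the per-class scores weighted by each class's support, so frequent classes dominate, while `'macro'` weights all classes equally.)

```python
from sklearn.metrics import f1_score, precision_score, recall_score

y_true = [0, 0, 0, 1, 2]
y_pred = [0, 0, 1, 1, 2]

# Per-class F1 here is 0.8 (class 0), 2/3 (class 1), 1.0 (class 2).
print(f1_score(y_true, y_pred, average='weighted'))  # ~0.813
print(f1_score(y_true, y_pred, average='macro'))     # ~0.822
print(precision_score(y_true, y_pred, average='weighted'))
print(recall_score(y_true, y_pred, average='weighted'))
```

Without an `average` argument, these functions default to `average='binary'` and cannot handle more than two classes.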
Find more information on the `average` parameter here:
https://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html
Your problem is caused by the presence of continuous values in val_predict in this line of code:
_val_f1 = f1_score(val_targ, val_predict)
You should round your predictions in val_predict before calculating f1_score. Example solution:
_val_f1 = f1_score(val_targ,np.round(val_predict))
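(A quick toy sketch of why rounding fixes the type mismatch, using assumed sigmoid-style outputs for a multilabel case: rounding converts the continuous probabilities into the 0/1 indicator matrix that `f1_score` expects.)

```python
import numpy as np
from sklearn.metrics import f1_score

# Toy continuous outputs for 3 samples x 2 labels
val_predict = np.array([[0.80, 0.20],
                        [0.40, 0.90],
                        [0.55, 0.10]])
val_targ = np.array([[1, 0],
                     [0, 1],
                     [1, 0]])

# f1_score(val_targ, val_predict) would raise the ValueError;
# np.round turns the probabilities into a 0/1 indicator matrix.
print(f1_score(val_targ, np.round(val_predict), average='weighted'))  # 1.0
```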
Worth mentioning: if you want to change the threshold of the rounding (0.5 by default), you can add or subtract a value in the [0, 1] interval before rounding:
>>> a = np.arange(0, 1, 0.1)
>>> print(a, abs(np.round(a - 0.1)), sep='\n')
[0.  0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9]
[0. 0. 0. 0. 0. 0. 1. 1. 1. 1.]
>>> print(a, abs(np.round(a + 0.3)), sep='\n')
[0.  0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9]
[0. 0. 0. 1. 1. 1. 1. 1. 1. 1.]
Hope that helps!