
How to calculate the overall accuracy of a custom trained spaCy NER model with a confusion matrix?

I'm trying to evaluate my custom trained spaCy NER model. How can I find the overall accuracy of the model using a confusion matrix?

I tried evaluating the model with the spaCy scorer, which gives precision, recall and token accuracy, following the reference below:

Evaluation in a Spacy NER model

I expect the output as a confusion matrix instead of individual precision, recall and token accuracy values.

Here is a good read on creating confusion matrices for spaCy NER models. It is based on the BILOU format used by spaCy. It works well for small portions of text, but when bigger documents are evaluated the confusion matrix is hard to read because most tokens are labeled O.
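To make the BILOU scheme concrete, here is a minimal sketch that turns token-level entity spans into one tag per token. The helper name spans_to_bilou and the example sentence are made up for illustration and are not part of spaCy; if I remember correctly, spaCy v3 ships its own helper for character offsets (spacy.training.offsets_to_biluo_tags), which is probably what you want on real Doc objects.

```python
def spans_to_bilou(n_tokens, spans):
    """Return one BILOU tag per token.

    spans: list of (start, end, label) token indexes, end exclusive.
    Hypothetical helper for illustration; assumes spans do not overlap.
    """
    tags = ["O"] * n_tokens                      # everything outside an entity is O
    for start, end, label in spans:
        if end - start == 1:
            tags[start] = f"U-{label}"           # Unit: single-token entity
        else:
            tags[start] = f"B-{label}"           # Begin
            for i in range(start + 1, end - 1):
                tags[i] = f"I-{label}"           # Inside
            tags[end - 1] = f"L-{label}"         # Last
    return tags


tokens = ["Yesterday", "John", "Smith", "visited", "Paris"]
gold_tags = spans_to_bilou(len(tokens), [(1, 3, "PER"), (4, 5, "LOC")])
print(gold_tags)  # ['O', 'B-PER', 'L-PER', 'O', 'U-LOC']
```

Running the same conversion on the gold annotations and on the model's predictions gives two tag lists of equal length, which is the input a token-level confusion matrix needs.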

What you can do is create two lists, one with the predicted label per word and one with the true label per word, and compare them using the sklearn.metrics.confusion_matrix() function.

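from sklearn.metrics import confusion_matrix

# One tag per token; the tags must be strings
y_true = ["O", "O", "O", "B-PER", "I-PER"]
y_pred = ["O", "O", "O", "B-PER", "O"]
# Rows are true labels, columns are predicted labels, in the order given by `labels`
confusion_matrix(y_true, y_pred, labels=["O", "B-PER", "I-PER"])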
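To get from the matrix to a single overall accuracy figure, you can divide the diagonal (correctly labeled tokens) by the total number of tokens. The sketch below assumes the scikit-learn layout (rows are true labels, columns are predicted labels) and hard-codes the matrix for the five-token example above; the function names overall_accuracy and entity_token_accuracy are my own, not part of any library. Because O tokens usually dominate, the second function restricts the count to tokens whose true label is not O, which is often the more informative number.

```python
def overall_accuracy(cm):
    """Share of tokens on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total if total else 0.0


def entity_token_accuracy(cm, labels, outside="O"):
    """Same ratio, restricted to rows whose true label is not `outside`."""
    keep = [i for i, lab in enumerate(labels) if lab != outside]
    total = sum(sum(cm[i]) for i in keep)
    correct = sum(cm[i][i] for i in keep)
    return correct / total if total else 0.0


labels = ["O", "B-PER", "I-PER"]
# Matrix for y_true = O O O B-PER I-PER and y_pred = O O O B-PER O
cm = [[3, 0, 0],   # true O
      [0, 1, 0],   # true B-PER
      [1, 0, 0]]   # true I-PER, predicted as O
print(overall_accuracy(cm))               # 0.8
print(entity_token_accuracy(cm, labels))  # 0.5
```

Both functions only index and iterate over rows, so they should also accept the NumPy array returned by confusion_matrix().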

You can also use the plot_confusion_matrix() function from the same library to get a visual output. However, this requires scikit-learn 0.23.1 or above and is only usable with scikit-learn classifiers. Note that plot_confusion_matrix() was deprecated in scikit-learn 1.0 and removed in 1.2 in favor of ConfusionMatrixDisplay.

As written in this Stack Overflow question, this is a way to plot the output of confusion_matrix() from scikit-learn without their plotting function.

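import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix

# y_test (true labels) and pred (predicted labels) are assumed to be defined already
labels = ['business', 'health']
# Pass labels by keyword; recent scikit-learn versions no longer accept it positionally
cm = confusion_matrix(y_test, pred, labels=labels)
print(cm)
fig = plt.figure()
ax = fig.add_subplot(111)
cax = ax.matshow(cm)
plt.title('Confusion matrix of the classifier')
fig.colorbar(cax)
# Put one tick per class so each label lines up with its row and column
ax.set_xticks(range(len(labels)))
ax.set_yticks(range(len(labels)))
ax.set_xticklabels(labels)
ax.set_yticklabels(labels)
plt.xlabel('Predicted')
plt.ylabel('True')
plt.show()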

