
NER using spaCy model

I keep getting a message that there are no named entities in my corpus. I expect cats, dogs, etc. to be identified as PERSON. How can I fix this?

import numpy as np
import pandas as pd

import spacy
from spacy import displacy

nlp = spacy.load("en_core_web_sm")

corpus=['cats are selfish', 'it is raining cats and dogs', 'dogs do not like birds','i do not like rabbits','i have eaten frogs snakes and alligators']

for sent in corpus:
    sentence_nlp = nlp(sent)
    # print named entities in sentences
    print([(word, word.ent_type_) for word in sentence_nlp if word.ent_type_])
    # visualize named entities
    displacy.render(sentence_nlp, style='ent', jupyter=True)

The output I get is:

[]
./NER_Spacy.py:19: UserWarning: [W006] No entities to visualize found in Doc object. If this is surprising to you, make sure the Doc was processed using a model that supports named entity recognition, and check the `doc.ents` property manually if necessary.
 displacy.render(sentence_nlp, style='ent', jupyter=False)
[]
./NER_Spacy.py:19: UserWarning: [W006] No entities to visualize found in Doc object. If this is surprising to you, make sure the Doc was processed using a model that supports named entity recognition, and check the `doc.ents` property manually if necessary.
 displacy.render(sentence_nlp, style='ent', jupyter=False)
[]
./NER_Spacy.py:19: UserWarning: [W006] No entities to visualize found in Doc object. If this is surprising to you, make sure the Doc was processed using a model that supports named entity recognition, and check the `doc.ents` property manually if necessary.
 displacy.render(sentence_nlp, style='ent', jupyter=False)
[]
./NER_Spacy.py:19: UserWarning: [W006] No entities to visualize found in Doc object. If this is surprising to you, make sure the Doc was processed using a model that supports named entity recognition, and check the `doc.ents` property manually if necessary.
 displacy.render(sentence_nlp, style='ent', jupyter=False)
[]
./NER_Spacy.py:19: UserWarning: [W006] No entities to visualize found in Doc object. If this is surprising to you, make sure the Doc was processed using a model that supports named entity recognition, and check the `doc.ents` property manually if necessary.
 displacy.render(sentence_nlp, style='ent', jupyter=False)

I am expecting that cats, dogs etc. will be identified as person

You're not expecting the right thing then :) spaCy's NER models are trained on different datasets depending on the language. For the model you're using, see here: https://spacy.io/models/en#en_core_web_sm

The dataset used to train the model you're using is called OntoNotes 5, and, like most people, it does not treat cats and dogs as PERSON. If you want "cats" and "dogs" recognized as entities, you need to train your own NER model on your own data. For example, you could label some data with an ANIMAL entity using regex rules built from a list of pets of interest, then use that labelled dataset to fine-tune the NER model to do what you want.
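As a rough illustration of the regex labelling step, here is a minimal sketch that uses only the standard library. The animal list, the `ANIMAL` label, and the `label_animals` helper are assumptions made up for this example, not part of spaCy. The output uses the `(text, {"entities": [(start, end, label)]})` character-offset layout that spaCy training examples are commonly built from; converting it into whatever your spaCy version expects for training is left out here.

```python
import re

# Hypothetical list of animals of interest; extend it for your own data.
ANIMALS = ["cat", "dog", "bird", "rabbit", "frog", "snake", "alligator"]

# One pattern matching any listed animal as a whole word, with an optional plural "s".
ANIMAL_RE = re.compile(
    r"\b(?:" + "|".join(map(re.escape, ANIMALS)) + r")s?\b",
    re.IGNORECASE,
)


def label_animals(text):
    """Return (text, {"entities": [(start, end, "ANIMAL"), ...]}) with character offsets."""
    spans = [(m.start(), m.end(), "ANIMAL") for m in ANIMAL_RE.finditer(text)]
    return text, {"entities": spans}


corpus = [
    "cats are selfish",
    "it is raining cats and dogs",
    "dogs do not like birds",
    "i do not like rabbits",
    "i have eaten frogs snakes and alligators",
]

# Build the labelled dataset and show which spans were tagged in each sentence.
train_data = [label_animals(sent) for sent in corpus]
for text, annotations in train_data:
    print(text, annotations["entities"])
```

Labels produced this way are only as good as the word list, so it is worth spot-checking them before fine-tuning. If all you need is to tag a fixed list of words, a purely rule-based component may be enough without any training.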
