
NER training using spaCy

When training a blank NER model, should I include only labeled data (examples that contain at least one entity), or should I also include examples that contain no entities at all (in effect, teaching the model that in some circumstances these words carry no label)?

If you look at the commonly used training data for NER (you can find links at http://nlpprogress.com/english/named_entity_recognition.html), you'll see that most, if not all, examples contain at least one entity.

Despite that, the model probably still learns that most entity types don't show up in any given example. But you can always try adding true-negative examples and see if that helps.
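If you want to try the true-negative experiment, here is a minimal sketch of how the data could be assembled. It uses only plain Python and the `(text, {"entities": [...]})` annotation shape that spaCy training data commonly takes; a negative example is simply a text with an empty entity list. The sample sentences, the helper name `mix_in_negatives`, and the `max_negative_share` parameter are all made up for illustration, and the 50% cap is an arbitrary starting point, not a recommendation.

```python
import random

# Positive examples: character offsets (start, end, label) for each entity.
positives = [
    ("Apple is opening an office in Berlin.",
     {"entities": [(0, 5, "ORG"), (30, 36, "GPE")]}),
    ("Maria joined Google last year.",
     {"entities": [(0, 5, "PERSON"), (13, 19, "ORG")]}),
]

# True negatives: texts with an explicitly empty entity list.
negatives = [
    ("I ate an apple for breakfast.", {"entities": []}),
    ("The meeting was moved to next week.", {"entities": []}),
    ("It rained all afternoon.", {"entities": []}),
]


def mix_in_negatives(positives, negatives, max_negative_share=0.5, seed=0):
    """Return shuffled data in which negatives are at most
    `max_negative_share` of the combined set (hypothetical helper)."""
    if not 0 <= max_negative_share < 1:
        raise ValueError("max_negative_share must be in [0, 1)")
    rng = random.Random(seed)
    # n_neg / (n_pos + n_neg) <= share  =>  n_neg <= share * n_pos / (1 - share)
    limit = int(max_negative_share * len(positives) / (1 - max_negative_share))
    sampled = rng.sample(negatives, min(limit, len(negatives)))
    data = list(positives) + sampled
    rng.shuffle(data)
    return data


train_data = mix_in_negatives(positives, negatives)
n_negative = sum(1 for _, ann in train_data if not ann["entities"])
print(f"{len(train_data)} examples, {n_negative} without entities")
```

In spaCy v3, each pair could then be turned into a training example with `Example.from_dict(nlp.make_doc(text), annotations)`. Training once with `max_negative_share=0.0` and once with a non-zero share, then comparing scores on the same held-out set, is one way to "see if that helps".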

