
Dataset to train MITIE NER model

Is there an existing dataset with tagged entities that can be used to train a MITIE NER model? I checked https://github.com/mit-nlp/MITIE/blob/master/examples/python/train_ner.py, which trains the model with just two samples.
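For context, here is a rough sketch of how the two-sample pattern in train_ner.py might be scaled up to a tagged dataset. It assumes the data is already available as (tokens, BIO tags) pairs, which is an assumption of this sketch and not something MITIE requires. The helper names `bio_to_spans` and `train_from_bio` are hypothetical; the MITIE calls (`ner_trainer`, `ner_training_instance`, `add_entity`, `add`, `num_threads`, `train`, `save_to_disk`) are the ones used in the linked example.

```python
def bio_to_spans(tags):
    """Convert BIO tags to (start, end, label) spans; end is exclusive.

    A stray "I-X" tag with no matching open entity is treated as the
    start of a new entity (an assumption of this sketch).
    """
    spans = []
    start, label = None, None
    for i, tag in enumerate(tags):
        # Still inside the currently open entity: nothing to do.
        if tag.startswith("I-") and label == tag[2:]:
            continue
        # Any other tag closes the open entity, if there is one.
        if label is not None:
            spans.append((start, i, label))
            start, label = None, None
        # "B-X" (or a stray "I-X") opens a new entity.
        if tag.startswith("B-") or tag.startswith("I-"):
            start, label = i, tag[2:]
    if label is not None:
        spans.append((start, len(tags), label))
    return spans


def train_from_bio(sentences, feature_extractor_path, output_path, num_threads=4):
    """Train a MITIE NER model from an iterable of (tokens, bio_tags) pairs.

    feature_extractor_path should point at total_word_feature_extractor.dat
    from the MITIE model download; both paths are placeholders here.
    """
    # Imported lazily so bio_to_spans can be used without MITIE installed.
    from mitie import ner_trainer, ner_training_instance

    trainer = ner_trainer(feature_extractor_path)
    for tokens, tags in sentences:
        sample = ner_training_instance(tokens)
        for start, end, label in bio_to_spans(tags):
            sample.add_entity(range(start, end), label)
        trainer.add(sample)
    trainer.num_threads = num_threads
    ner = trainer.train()
    ner.save_to_disk(output_path)
    return ner
```

The span conversion is kept separate from the training call so it can be checked on a few sentences before starting a long training run.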

I've been looking for something like this, too, simply for a "generic" (and hence not very useful) NLU backend. The only thing I've found so far is a trained model with 9 news categories (not very generic). See the blog post here: http://eric-yuan.me/ner_1/

If you have the option to switch NER libraries, spaCy has a trained model available by default. Its visualisation front end can be found by googling "displacy".

If you find anything else, let me know!

EDIT: Spent the day looking into this and I think I've found what you're after. If you go to https://github.com/mit-nlp/MITIE/releases you'll find MITIE's own NER model trained on Wikipedia, Freebase, etc. The actual training dataset is there too. The README on their GitHub page provides an example of how to use the pre-trained model. You can also look at the ner.py file in the examples folder to see how to use the pre-trained model in Python code.
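A minimal sketch of what loading the pre-trained model looks like, modelled on ner.py. The model path is a placeholder for wherever you unpack the release archive, and the helper names `entities_to_records` and `extract_entities_from_text` are hypothetical. It assumes `extract_entities` returns (token range, tag, score) tuples as in the current example script; the score is treated as optional in case your version returns only the first two fields.

```python
def entities_to_records(tokens, entities):
    """Turn MITIE entity tuples into plain dicts.

    tokens:   list of tokens (MITIE's tokenize may return bytes on Python 3)
    entities: iterable of (token_range, tag[, score]) tuples
    """
    records = []
    for entity in entities:
        token_range, tag = entity[0], entity[1]
        score = entity[2] if len(entity) > 2 else None
        words = []
        for i in token_range:
            token = tokens[i]
            # Decode bytes tokens so the joined text is a normal string.
            words.append(token.decode("utf-8") if isinstance(token, bytes) else token)
        records.append({"text": " ".join(words), "tag": tag, "score": score})
    return records


def extract_entities_from_text(text, model_path="MITIE-models/english/ner_model.dat"):
    """Run the pre-trained MITIE model over a string (model_path is a placeholder)."""
    # Imported lazily so entities_to_records works without MITIE installed.
    from mitie import named_entity_extractor, tokenize

    ner = named_entity_extractor(model_path)
    tokens = tokenize(text)
    return entities_to_records(tokens, ner.extract_entities(tokens))
```

Calling `extract_entities_from_text("...")` with the downloaded model in place should give one dict per detected entity; `ner.get_possible_ner_tags()` lists the tags the model can produce.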


 