
How much training data (sentences) is required for custom NER using spaCy (Python)? [Just a rough idea]

I want to know: say I have 10 custom entities to recognize, roughly how many annotated training sentences should I provide?

Thank you in advance! :)

I am new to this, please help.

For developing a custom NER model, at least 50-100 occurrences of each entity will be required, along with their proper context. Otherwise, if you have less data, your custom model will overfit. So, depending on your data, you will require at least 200 to 300 sentences.
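To make "annotated training sentences" concrete, below is a minimal sketch of how such examples are typically fed to spaCy's NER training loop (spaCy 3.x API). The entity labels (VENDOR, CITY) and the example texts are made up for illustration; in practice you would collect 50-100 occurrences of each of your own labels.

```python
import random
import spacy
from spacy.training import Example

# Hypothetical annotated sentences: (text, {"entities": [(start, end, label)]}).
# You would need many more of these per custom label for a usable model.
TRAIN_DATA = [
    ("Acme Corp shipped 20 routers to Berlin", {"entities": [(0, 9, "VENDOR"), (32, 38, "CITY")]}),
    ("The order was fulfilled by Acme Corp", {"entities": [(27, 36, "VENDOR")]}),
]

nlp = spacy.blank("en")          # start from a blank English pipeline
ner = nlp.add_pipe("ner")        # add an empty NER component
for _, ann in TRAIN_DATA:
    for start, end, label in ann["entities"]:
        ner.add_label(label)     # register each custom entity label

optimizer = nlp.initialize()     # initialize model weights
for epoch in range(30):
    random.shuffle(TRAIN_DATA)
    losses = {}
    for text, ann in TRAIN_DATA:
        example = Example.from_dict(nlp.make_doc(text), ann)
        nlp.update([example], sgd=optimizer, drop=0.35, losses=losses)
    print(epoch, losses)
```

After training, `nlp("Acme Corp opened an office in Berlin").ents` would return the recognized spans; with only a handful of sentences per label the predictions will be unreliable, which is exactly the overfitting risk described above.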

For a custom NER model with spaCy, you will definitely require around 100 samples for each entity, and that without any biases in your dataset.

All this is as per my experience.

Suggestion: you can explore spaCy's custom model, but for production-level work or a serious project you cannot depend on it alone; you will have to do some additional NLP, relation extraction, etc. along with it.

Hope this helps.
