spaCy: How do I write named entities to an existing Doc object using a loaded model?

Question

I created a Doc object from a custom list of tokens, following the documentation, like so:

import spacy
from spacy.tokens import Doc

nlp = spacy.load("my_ner_model")
doc = Doc(nlp.vocab, words=["Hello", ",", "world", "!"])

How do I now write named entity tags to doc with my NER model?

I tried doc = nlp(doc), but that didn't work: it raised a TypeError.

I can't just join my list of words into plain text and call doc = nlp(text) as usual, because spaCy then splits some of the words in my texts into two tokens, which I can't accept.

Answer 1

You can get the NER component from your loaded model and call it directly on the constructed Doc:

doc = nlp.get_pipe("ner")(doc)

You can list all the available components in the pipeline with nlp.pipe_names and call each of them individually in the same way. When you call nlp(), the tokenizer always runs first, but it isn't included in this list, which contains only the components that both take and return a Doc.
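If you want more than just the NER step, the same idea extends to a loop over nlp.pipeline, which holds (name, component) pairs. The following is only a minimal sketch: run_components is a hypothetical helper name, not part of spaCy, and it assumes each selected component can run on a Doc without the output of a component you skipped (which may not hold for every pipeline).

```python
def run_components(nlp, doc, names=None):
    """Apply pipeline components to an already-tokenized Doc.

    Hypothetical helper (not a spaCy API). The tokenizer is never
    invoked, so the tokens in `doc` stay exactly as constructed.
    `names` optionally restricts the run to the listed components.
    """
    # nlp.pipeline is a list of (name, component) tuples, in pipeline order.
    for name, component in nlp.pipeline:
        if names is None or name in names:
            # Each component takes a Doc and returns a Doc.
            doc = component(doc)
    return doc
```

With the nlp and doc objects from the question, run_components(nlp, doc, names=["ner"]) should do the same thing as the one-liner above, and the predicted entities can then be read from doc.ents (for example via ent.text and ent.label_).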

Answer 1 (1 accepted, 2019-10-13 14:33:32)

spaCy：如何为此使用一些已加载的 model 将命名实体写入现有文档 object？

问题描述

1 个解决方案

解决方案1 1 已采纳 2019-10-13 14:33:32

解决方案1
1 已采纳 2019-10-13 14:33:32