spaCy: How to write named entities to an existing Doc object using a loaded model?
I created a Doc object from a custom list of tokens, following the documentation:
import spacy
from spacy.tokens import Doc
nlp = spacy.load("my_ner_model")
doc = Doc(nlp.vocab, words=["Hello", ",", "world", "!"])
How do I now write named-entity tags to doc with my NER model?
I tried doc = nlp(doc), but that raised a TypeError.
I can't simply join my list of words into plain text and call doc = nlp(text) as usual, because spaCy then splits some of the words in my texts into two tokens, which I can't accept.
You can get the NER component from your loaded model and call it directly on the constructed Doc:
doc = nlp.get_pipe("ner")(doc)
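Here is a minimal, self-contained sketch of that call. Since my_ner_model is not available here, it assumes a blank English pipeline with a rule-based entity_ruler standing in for the trained "ner" component. The pattern and the example words are made up for illustration.

```python
import spacy
from spacy.tokens import Doc

# Assumption: a blank pipeline plus a rule-based "entity_ruler" stands in
# for the trained model and its "ner" component.
nlp = spacy.blank("en")
ruler = nlp.add_pipe("entity_ruler")
ruler.add_patterns([{"label": "GPE", "pattern": "London"}])  # illustrative pattern

# Pre-tokenized input: "can't" has to stay a single token.
words = ["I", "can't", "leave", "London", "!"]
doc = Doc(nlp.vocab, words=words)

# Call the component directly on the Doc; the tokenizer is never run.
doc = nlp.get_pipe("entity_ruler")(doc)

tokens = [token.text for token in doc]
entities = [(ent.text, ent.label_) for ent in doc.ents]
print(tokens)
print(entities)
```

With a real trained pipeline, the same pattern should apply with nlp.get_pipe("ner") in place of the stand-in component.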
You can list all the available components in the pipeline with nlp.pipe_names and call each of them individually this way. The tokenizer is always the first step of the pipeline when you call nlp(), but it isn't included in this list, which contains only the components that both take and return a Doc.
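If you want every component to run on the pre-tokenized Doc, not just NER, one option is to loop over nlp.pipeline, which holds (name, component) pairs. The sketch below again assumes a blank English pipeline, here with a sentencizer and an entity_ruler as stand-ins for the components of a trained model. The component choice and example words are illustrative only.

```python
import spacy
from spacy.tokens import Doc

# Assumption: stand-in components instead of a trained pipeline.
nlp = spacy.blank("en")
nlp.add_pipe("sentencizer")
ruler = nlp.add_pipe("entity_ruler")
ruler.add_patterns([{"label": "GPE", "pattern": "London"}])  # illustrative pattern

# Only Doc -> Doc components are listed; the tokenizer is not among them.
names = list(nlp.pipe_names)
print(names)

words = ["I", "can't", "leave", "London", "!"]
doc = Doc(nlp.vocab, words=words)

# Apply each component in pipeline order, skipping the tokenizer entirely.
for name, component in nlp.pipeline:
    doc = component(doc)

tokens = [token.text for token in doc]
entities = [(ent.text, ent.label_) for ent in doc.ents]
print(tokens)
print(entities)
```

Running the components in pipeline order matters when a later component relies on annotations set by an earlier one.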