
Visualizing customized NER tags with SpaCy Displacy

I'm new to spaCy and Python, and I want to use this library to visualize NER results. Here is the example I found:

import spacy
from spacy import displacy

NER = spacy.load("en_core_web_sm")

raw_text = "The Indian Space Research Organisation or is the national space agency of India, headquartered in Bengaluru. It operates under Department of Space which is directly overseen by the Prime Minister of India while Chairman of ISRO acts as executive of DOS as well."

text1 = NER(raw_text)

displacy.render(text1, style="ent", jupyter=True)

[Image: example visualization]

However, I already have a list of custom labels and their positions:

 [812, 834, "POS"], [838, 853, "ORG"], [870, 888, "POS"], [892, 920, "ORG"], [925, 929, "ENGLEVEL"], [987, 1002, "SKILL"],...

I want my text to be visualized with my own custom labels and entities rather than spaCy's default NER output. How can I do that?

You need to create character spans that represent your entities and attach them to your doc object. Something like this:

import spacy
from spacy import displacy

nlp = spacy.blank('en')
raw_text = "The Indian Space Research Organisation or is the national space agency of India, headquartered in Bengaluru. It operates under Department of Space which is directly overseen by the Prime Minister of India while Chairman of ISRO acts as executive of DOS as well."
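# Tokenize only; no trained pipeline is needed to attach custom entities.
doc = nlp.make_doc(raw_text)
# [start_char, end_char, label] triples; offsets must refer to raw_text.
spans = [[812, 834, "POS"], [838, 853, "ORG"], [870, 888, "POS"], [892, 920, "ORG"], [925, 929, "ENGLEVEL"],
         [987, 1002, "SKILL"]]
ents = []
for span_start, span_end, label in spans:
    ent = doc.char_span(span_start, span_end, label=label)
    # char_span() returns None when the offsets do not map to a valid span.
    if ent is None:
        continue
    ents.append(ent)

# Replace the doc's entities with the custom spans, then render them.
doc.ents = ents
displacy.render(doc, style="ent", jupyter=True)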

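Change raw_text and spans to match your own data. If a span's start or end lies beyond the length of the text, doc.char_span() returns None, so you need to handle that case appropriately. It also returns None when the character offsets do not line up with token boundaries.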
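If you would rather not build a Doc at all, displaCy also has a manual mode that takes a plain dict with "text" and "ents" keys. The following is only a minimal sketch of that route, not part of the answer above: the helper names build_manual_doc and render_html, the sample sentence, and the colour values are all made up for illustration. It drops spans that are empty, out of range, or overlap an earlier span, which is one possible policy among several.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Span = Tuple[int, int, str]  # (start_char, end_char, label)


def build_manual_doc(text: str, spans: Sequence[Span]) -> Dict[str, object]:
    """Turn (start, end, label) triples into displaCy's manual "ent" input.

    Hypothetical helper: spans that are empty, fall outside the text, or
    overlap an already accepted span are silently dropped.
    """
    ents: List[Dict[str, object]] = []
    last_end = 0
    for start, end, label in sorted(spans, key=lambda s: (s[0], s[1])):
        in_range = 0 <= start < end <= len(text)
        if not in_range or start < last_end:
            continue
        ents.append({"start": start, "end": end, "label": label})
        last_end = end
    return {"text": text, "ents": ents, "title": None}


def render_html(doc_dict: Dict[str, object],
                colors: Optional[Dict[str, str]] = None) -> str:
    """Render the dict with displaCy and return the HTML markup as a string."""
    from spacy import displacy  # imported lazily; requires spaCy to be installed

    options = {"colors": colors or {}}
    return displacy.render(doc_dict, style="ent", manual=True,
                           jupyter=False, options=options)


if __name__ == "__main__":
    # Made-up sample text; the offsets below were counted against it.
    text = "Jane Doe is a senior engineer at Acme Corp with fluent English."
    spans = [(14, 29, "POS"), (33, 42, "ORG"), (48, 54, "ENGLEVEL"),
             (900, 910, "SKILL")]  # the last span is out of range on purpose
    doc_dict = build_manual_doc(text, spans)
    # Illustrative colours for the custom labels.
    html = render_html(doc_dict, colors={"POS": "#ffd966", "ENGLEVEL": "#9fc5e8"})
    print(html)
```

In a notebook you could pass jupyter=True instead of printing the returned HTML. The "colors" option is worth setting for custom labels, since displaCy only ships colours for its built-in label set.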



 