Visualizing customized NER tags with SpaCy Displacy

I'm new to spaCy and Python, and I want to use this library to visualize NER output. This is the example I found:

import spacy
from spacy import displacy

NER = spacy.load("en_core_web_sm")

raw_text = "The Indian Space Research Organisation or is the national space agency of India, headquartered in Bengaluru. It operates under Department of Space which is directly overseen by the Prime Minister of India while Chairman of ISRO acts as executive of DOS as well."

text1 = NER(raw_text)

displacy.render(text1, style="ent", jupyter=True)

[Image: example visualization]

However, I already have a list of custom labels and their character positions:

 [812, 834, "POS"], [838, 853, "ORG"], [870, 888, "POS"], [892, 920, "ORG"], [925, 929, "ENGLEVEL"], [987, 1002, "SKILL"],...

I want my text to be visualized with my own custom labels and entities instead of spaCy's default NER output. How can I do that?

You need to create character spans that represent your entities and attach them to your doc object. Something like this:

import spacy
from spacy import displacy

# A blank pipeline only tokenizes, so no default NER labels are added.
nlp = spacy.blank("en")
raw_text = "The Indian Space Research Organisation or is the national space agency of India, headquartered in Bengaluru. It operates under Department of Space which is directly overseen by the Prime Minister of India while Chairman of ISRO acts as executive of DOS as well."
doc = nlp.make_doc(raw_text)

# Each entry is [start_char, end_char, label].
spans = [[812, 834, "POS"], [838, 853, "ORG"], [870, 888, "POS"], [892, 920, "ORG"], [925, 929, "ENGLEVEL"],
         [987, 1002, "SKILL"]]
ents = []
for span_start, span_end, label in spans:
    ent = doc.char_span(span_start, span_end, label=label)
    # char_span() returns None when the offsets do not map to a valid span.
    if ent is None:
        continue

    ents.append(ent)

# Replace the doc's entities with the custom ones and render them.
doc.ents = ents
displacy.render(doc, style="ent", jupyter=True)

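Change raw_text and spans to match your own data. The sample text above is much shorter than the offsets in the question, so as written every span is skipped. If a span's start or end lies beyond the length of the text, doc.char_span() returns None, so you need to handle that case appropriately. With the default settings it also returns None when the offsets do not line up with token boundaries.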
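To make the None handling concrete, here is a minimal sketch that records which spans were dropped instead of skipping them silently. The sample sentence, its offsets, and the skipped list are hypothetical and not taken from the question's data. The explicit range check is an assumption added so the result does not depend on how char_span() treats out-of-range offsets.

```python
import spacy
from spacy import displacy

nlp = spacy.blank("en")  # tokenizer only; no trained model is required

# Hypothetical sample text and offsets, chosen so that the offsets fit the text.
raw_text = "Jane is a senior data engineer at Acme Labs"
spans = [
    [10, 30, "POS"],      # "senior data engineer"
    [34, 43, "ORG"],      # "Acme Labs"
    [35, 43, "ORG"],      # starts inside the token "Acme": not token-aligned
    [812, 834, "SKILL"],  # lies beyond the end of the text
]

doc = nlp.make_doc(raw_text)
ents, skipped = [], []
for start, end, label in spans:
    # Check the range ourselves before asking spaCy for the span.
    in_range = 0 <= start < end <= len(raw_text)
    ent = doc.char_span(start, end, label=label) if in_range else None
    if ent is None:
        skipped.append((start, end, label))
    else:
        ents.append(ent)

doc.ents = ents
for start, end, label in skipped:
    print(f"skipped {label} at [{start}, {end})")

# jupyter=False returns the markup as a string instead of displaying it.
html = displacy.render(doc, style="ent", jupyter=False)
```

If many of your spans are skipped because of token alignment, spaCy v3's char_span() also accepts an alignment_mode argument ("strict" by default, or "contract" / "expand") that snaps offsets to token boundaries. Whether snapping is acceptable depends on how your offsets were produced.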


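Notice: technical posts on this site are published under the CC BY-SA 4.0 license. If you reproduce this post, please credit this site's URL or the original source. For any questions, contact: yoyou2525@163.com.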
