
How to link spaCy named entities to the text in a nested dictionary?

I have a list of dictionaries containing text:

list_dicts = [{'id': 1, 'text': 'hello my name is Carla'}, {'id': 2, 'text': 'hello my name is John' }]

I applied spaCy named entity recognition to the nested texts like so:

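import spacy

# Assumption: the small English model is installed; any pipeline with an NER component works
nlp = spacy.load("en_core_web_sm")

for d in list_dicts:
    for k, v in d.items():
        if k == 'text':
            doc = nlp(v)  # run the pipeline on this record's text
            for ent in doc.ents:
                print([ent.text, ent.label_])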

The output is a printout of each named entity's text and its corresponding label, for example:

    ['Bob', 'PERSON']
    ['John', 'PERSON']

I would like to add the named entities to their corresponding text in each nested dictionary, which would look like this:

list_dicts = [{'id': 1, 'text': 'hello our names are Carla and Bob', 'entities': [['Carla', 'PERSON'], ['Bob', 'PERSON']]}, {'id': 2, 'text': 'hello my name is John', 'entities': [['John', 'PERSON']]}]

So far, I have tried using zip() to link the entities to the original text, with the idea of converting the result into a new list of dictionaries afterwards, but zip() does not seem to work with the spaCy objects.

Use dict.setdefault.

Example:

for d in list_dicts: 
    doc = nlp(d['text'])
    for ent in doc.ents:
        d.setdefault('entities', []).append([ent.text, ent.label_])
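
One thing to be aware of with the loop above: a record whose text yields no entities never gets an 'entities' key at all, because setdefault is only called inside the inner loop. If every record should carry the key, a possible variant is to assign a list comprehension instead. The sketch below is only an illustration under stated assumptions: add_entities and fake_nlp are hypothetical names, and fake_nlp is a stand-in that mimics just the doc.ents, ent.text and ent.label_ attributes so the code runs without a model installed. With spaCy you would pass your loaded pipeline (the nlp object) instead.

```python
from types import SimpleNamespace


def add_entities(records, nlp):
    """Attach an 'entities' list to every record, even when nothing is found."""
    for record in records:
        doc = nlp(record["text"])
        # Assigning directly guarantees the key exists (possibly as an empty list).
        record["entities"] = [[ent.text, ent.label_] for ent in doc.ents]
    return records


def fake_nlp(text):
    """Hypothetical stand-in for a spaCy pipeline: tags a few known names as PERSON."""
    known_names = {"Carla", "Bob", "John"}
    ents = [
        SimpleNamespace(text=token, label_="PERSON")
        for token in text.split()
        if token in known_names
    ]
    return SimpleNamespace(ents=ents)


list_dicts = [
    {"id": 1, "text": "hello our names are Carla and Bob"},
    {"id": 2, "text": "hello my name is John"},
    {"id": 3, "text": "hello there"},
]

add_entities(list_dicts, fake_nlp)  # with spaCy: add_entities(list_dicts, nlp)

for record in list_dicts:
    print(record["id"], record["entities"])
```

Here the third record ends up with an empty 'entities' list rather than a missing key, which tends to make later code that reads d['entities'] simpler.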


 