
Python - Merge list of tuples from nested list

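I have a list of tuples that I want to merge. The code below combines the attributes when a single list is passed to "classified_text". How do I apply the same idea to a nested list of tuples? I tried adding another for loop and the append method, but I get a different error. Is there a simple way to do this? Thanks!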

Input Text 1 - Working:

classified_text = [('John', 'PERSON'), ('Smith', 'PERSON'),('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('ABC', 'ORGANIZATION')] # Single list

Output Text 1 - Working:

[('PERSON      ', 'John Smith'), ('ORGANIZATION', 'University of ABC')]

Input Text 2 - Not Working: nested list with tuples

classified_text = [[('John', 'PERSON'), ('Smith', 'PERSON')], [('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('ABC', 'ORGANIZATION')], [('some', 'O'), ('text', 'O'), ('here', 'O')], [('Mark', 'O'), ('from', 'O'), ('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('CA', 'ORGANIZATION')]]

Code:

from itertools import groupby
entity_extracted_words = []
for tag, chunk in groupby(classified_text, lambda x:x[1]):
    if tag != "O":
        info_ner = "%-12s"%tag, " ".join(w for w, t in chunk)
        entity_extracted_words.append(info_ner)

print('entity_extracted_words:\n', entity_extracted_words)

Output Text 2 - Trying to get this result:

[('PERSON      ', 'John Smith'), ('ORGANIZATION', 'University of ABC'), ('ORGANIZATION', 'University of CA')]

Error: TypeError: not all arguments converted during string formatting

Try something like this. Simply loop over each sublist with a for-loop, combine its words into a string, and append the result to newlist.

classified_text = [[('John', 'PERSON'), ('Smith', 'PERSON')], 
                   [('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('ABC', 'ORGANIZATION')],
                   [('some', 'O'), ('text', 'O'), ('here', 'O')],
                   [('Mark', 'O'), ('from', 'O'), ('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('CA', 'ORGANIZATION')]]

newlist = []
for sublist in classified_text:
    combined = []
    for chunk, tag in sublist:
        if tag == 'O':
            continue
        combined_tag = tag
        combined.append(chunk)

    # Append tag and string to list
    if combined:
        # To pad the tag with spaces as in your example, you can use
        # the string's ljust method
        newlist.append((combined_tag.ljust(12), ' '.join(combined)))

print(newlist)

#[('PERSON      ', 'John Smith'),
# ('ORGANIZATION', 'University of ABC'),
# ('ORGANIZATION', 'University of CA')]
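
Note that the loop above joins every non-'O' word of a sublist into one string and labels it with the last tag seen, which is fine for the sample data. If a sublist could hold two separate entities, running groupby on each sublist keeps them apart. A minimal sketch, assuming the same classified_text layout as in the question; the helper name merge_entities is made up for illustration:

```python
from itertools import groupby

def merge_entities(nested):
    """Merge consecutive same-tag words within each sublist (hypothetical helper)."""
    merged = []
    for sublist in nested:
        # groupby only groups *consecutive* items that share the same key,
        # so two entities separated by 'O' words stay separate
        for tag, chunk in groupby(sublist, key=lambda pair: pair[1]):
            if tag != 'O':
                words = ' '.join(word for word, _ in chunk)
                merged.append((tag.ljust(12), words))
    return merged

classified_text = [[('John', 'PERSON'), ('Smith', 'PERSON')],
                   [('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('ABC', 'ORGANIZATION')],
                   [('some', 'O'), ('text', 'O'), ('here', 'O')],
                   [('Mark', 'O'), ('from', 'O'), ('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('CA', 'ORGANIZATION')]]

print(merge_entities(classified_text))
```

For the sample input this should give the same three tuples as the loop above; the difference only shows up when one sublist mixes several entities.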

You can first flatten the list of lists into a single list:

flat_list = [item for sublist in classified_text for item in sublist]

That flat list should then work with your original code.
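
As for why the error appears: with the nested input, groupby sees whole sublists, so x[1] is a tuple such as ('Smith', 'PERSON'), and "%-12s" % tag then treats that tuple as two arguments for a single placeholder. Flattening first avoids this. Here is a sketch of the full flow, using itertools.chain.from_iterable as an equivalent alternative to the list comprehension. One caveat, which is an observation about groupby rather than something from the question: if one sublist ends and the next begins with the same tag, the two runs would be merged into one entity. That does not happen with the sample data.

```python
from itertools import chain, groupby

classified_text = [[('John', 'PERSON'), ('Smith', 'PERSON')],
                   [('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('ABC', 'ORGANIZATION')],
                   [('some', 'O'), ('text', 'O'), ('here', 'O')],
                   [('Mark', 'O'), ('from', 'O'), ('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('CA', 'ORGANIZATION')]]

# chain.from_iterable yields the tuples of every sublist in order,
# same result as the nested list comprehension
flat_list = list(chain.from_iterable(classified_text))

# The original loop from the question, now fed the flat list
entity_extracted_words = []
for tag, chunk in groupby(flat_list, lambda x: x[1]):
    if tag != "O":
        info_ner = "%-12s" % tag, " ".join(w for w, t in chunk)
        entity_extracted_words.append(info_ner)

print('entity_extracted_words:\n', entity_extracted_words)
```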
