
Python - Merge list of tuples from nested list

I have a list of lists of tuples that I want to merge. The code below combines the words that share a tag when a single list is passed in as 'classified_text'. How do I apply the same idea to a nested list of tuples? I tried adding another for loop and the append method, but I got a different error. Is there a simple way to do this? Thanks!

Input Text 1 - Working:

classified_text = [('John', 'PERSON'), ('Smith', 'PERSON'),('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('ABC', 'ORGANIZATION')] # Single list

Output Text 1 - Working:

[('PERSON      ', 'John Smith'), ('ORGANIZATION', 'University of ABC')]

Input Text 2 - Not Working: Nested list with tuples

classified_text = [[('John', 'PERSON'), ('Smith', 'PERSON')], [('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('ABC', 'ORGANIZATION')], [('some', 'O'), ('text', 'O'), ('here', 'O')], [('Mark', 'O'), ('from', 'O'), ('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('CA', 'ORGANIZATION')]]

Code:

from itertools import groupby
entity_extracted_words = []
for tag, chunk in groupby(classified_text, lambda x:x[1]):
    if tag != "O":
        info_ner = "%-12s"%tag, " ".join(w for w, t in chunk)
        entity_extracted_words.append(info_ner)

print('entity_extracted_words:\n', entity_extracted_words)

Output Text 2 - Trying to get this result:

[('PERSON      ', 'John Smith'), ('ORGANIZATION', 'University of ABC'), ('ORGANIZATION', 'University of CA')]

Error: TypeError: not all arguments converted during string formatting
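For context, the error most likely arises because, with the nested input, groupby hands each whole sublist to the key function, so x[1] is the sublist's second tuple rather than a tag string. Formatting a two-element tuple with a single %s placeholder then fails. A minimal sketch reproducing this, using a shortened copy of the input (the variable error_message is introduced here only for illustration):

```python
classified_text = [
    [('John', 'PERSON'), ('Smith', 'PERSON')],
    [('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('ABC', 'ORGANIZATION')],
]

# With a nested list, each element x is a whole sublist,
# so x[1] is that sublist's second tuple, not a tag string.
tag = classified_text[0][1]
print(tag)

error_message = None
try:
    # A 2-tuple supplies two arguments to a format string with one placeholder.
    "%-12s" % tag
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)
```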

Try something like this. Simply loop over the sublists, join the words of each one into a string, and append the result to newlist.

classified_text = [[('John', 'PERSON'), ('Smith', 'PERSON')], 
                   [('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('ABC', 'ORGANIZATION')],
                   [('some', 'O'), ('text', 'O'), ('here', 'O')],
                   [('Mark', 'O'), ('from', 'O'), ('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('CA', 'ORGANIZATION')]]

newlist = []
for sublist in classified_text:
    combined = []
    for chunk, tag in sublist:
        if tag == 'O':
            continue
        combined_tag = tag
        combined.append(chunk)

    # Append the tag and the joined words to the result
    if combined:
        # To pad the tag with spaces as in your example, you can use
        # the string's ljust method
        newlist.append((combined_tag.ljust(12), ' '.join(combined)))

print(newlist)

#[('PERSON      ', 'John Smith'),
# ('ORGANIZATION', 'University of ABC'),
# ('ORGANIZATION', 'University of CA')]
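Note that the loop above keeps only the last non-'O' tag seen in each sublist, so a sublist that mixed, say, PERSON and ORGANIZATION words would be merged under one tag. If that case matters, one possible variation is to run the question's groupby logic on each sublist separately. This is a sketch under that assumption, not part of the original answer:

```python
from itertools import groupby

classified_text = [
    [('John', 'PERSON'), ('Smith', 'PERSON')],
    [('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('ABC', 'ORGANIZATION')],
    [('some', 'O'), ('text', 'O'), ('here', 'O')],
    [('Mark', 'O'), ('from', 'O'), ('University', 'ORGANIZATION'),
     ('of', 'ORGANIZATION'), ('CA', 'ORGANIZATION')],
]

entity_extracted_words = []
for sublist in classified_text:
    # Group consecutive (word, tag) pairs that share a tag within this sublist.
    for tag, chunk in groupby(sublist, key=lambda pair: pair[1]):
        if tag != 'O':
            words = ' '.join(word for word, _ in chunk)
            entity_extracted_words.append((tag.ljust(12), words))

print(entity_extracted_words)
```

Because the grouping is done per sublist, entities never merge across sublist boundaries.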

You could first flatten your list of lists into a single list:

flat_list = [item for sublist in classified_text for item in sublist]

That flat list should then work with your original code if you pass flat_list to groupby instead of classified_text.
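Put together, the flattening approach might look like the sketch below. It uses itertools.chain.from_iterable as an equivalent alternative to the list comprehension. One caveat: after flattening, groupby no longer sees sublist boundaries, so two adjacent sublists that end and start with the same tag would be merged into one entity. That does not happen with the sample input.

```python
from itertools import chain, groupby

classified_text = [
    [('John', 'PERSON'), ('Smith', 'PERSON')],
    [('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('ABC', 'ORGANIZATION')],
    [('some', 'O'), ('text', 'O'), ('here', 'O')],
    [('Mark', 'O'), ('from', 'O'), ('University', 'ORGANIZATION'),
     ('of', 'ORGANIZATION'), ('CA', 'ORGANIZATION')],
]

# Flatten one level: a list of lists of tuples becomes a list of tuples.
flat_list = list(chain.from_iterable(classified_text))

entity_extracted_words = []
for tag, chunk in groupby(flat_list, lambda x: x[1]):
    if tag != "O":
        info_ner = "%-12s" % tag, " ".join(w for w, t in chunk)
        entity_extracted_words.append(info_ner)

print('entity_extracted_words:\n', entity_extracted_words)
```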

