
Using itertools groupby to create a list of lists

I'm getting a list of items in the format parent.id_child.id, like 1_2. I tried to group the child ids by parent id; for example, from the input ['1_2', '2_2', '1_1', '2_1', '1_3'] I need the output [['1','2','3'], ['1','2']]. I have tried this:

import itertools

inputlist = ['1_2', '1_1', '2_1', '1_3', '2_2']
outputlist = [item.split('_') for item in inputlist]
outputlist.sort()  # groupby only groups consecutive items, so sort first
final = [list(group) for key, group in itertools.groupby(outputlist, lambda x: x[0])]

This groups the elements correctly, but I need only the second element of each item. How can I achieve this? Also, can I do the whole thing in a single list comprehension?

Yes, use a list comprehension. groupby passes the values into each group iterator unchanged, so you need to select the second element again:

final = [[g[1] for g in group] for key, group in itertools.groupby(outputlist, lambda x: x[0])]
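
As a quick end-to-end check, here is a minimal self-contained sketch of the same approach, run against the input from the question. The names pairs and child are only illustrative; the logic is the sort, group, then select-again sequence described above.

```python
import itertools

inputlist = ['1_2', '1_1', '2_1', '1_3', '2_2']

# Split each 'parent_child' string into a [parent, child] pair and sort,
# so that equal parents end up adjacent (groupby only merges consecutive items).
pairs = sorted(item.split('_') for item in inputlist)

# Group on the parent id, then keep only the child id from each pair.
final = [
    [child for _, child in group]
    for _, group in itertools.groupby(pairs, key=lambda pair: pair[0])
]
print(final)  # [['1', '2', '3'], ['1', '2']]
```

Note that the pairs are lists of strings here, so the sort is lexicographic; that is fine for single-digit ids like these.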

You can do the whole thing in a single expression by nesting the splitting into the groupby call, but this gets ugly fast, even when split across multiple lines:

final = [
    [g[1] for g in group]
    for key, group in itertools.groupby(
        sorted(item.split('_') for item in inputlist),
        lambda x: x[0])]

You could avoid sorting the whole input list and sort only the smaller groups by using a dictionary to do the grouping. Depending on the size of your ids, you may also want to sort them numerically (text sorting is lexicographic, so '10' sorts before '2'):

per_parent = {}
for item in inputlist:
    parent, child = item.split('_', 1)
    per_parent.setdefault(parent, []).append(child)
# sort the parents numerically, and the children within each group as well
final = [sorted(children, key=int) for parent, children in sorted(
    per_parent.items(), key=lambda pc: int(pc[0]))]

In Python 2, use iteritems() rather than items() to avoid building an intermediate list.
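
To illustrate why numeric sorting matters once ids have more than one digit, here is a rough sketch of the dictionary approach wrapped in a function. The function name group_children and the sample data are hypothetical, and collections.defaultdict is swapped in for setdefault purely as a matter of taste.

```python
from collections import defaultdict

def group_children(items):
    """Group child ids by parent id, ordering parents and children numerically."""
    per_parent = defaultdict(list)
    for item in items:
        parent, child = item.split('_', 1)
        per_parent[parent].append(child)
    return [
        sorted(children, key=int)
        for _, children in sorted(per_parent.items(), key=lambda pc: int(pc[0]))
    ]

# Plain string sorting is lexicographic, which misorders multi-digit ids.
print(sorted(['10', '9', '2']))  # ['10', '2', '9']

sample = ['10_2', '2_10', '2_9', '10_1']
print(group_children(sample))  # [['9', '10'], ['1', '2']]
```

Parent 2 comes before parent 10, and child 9 before child 10, which a plain string sort would get the wrong way round.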
