Using itertools groupby to create a list of lists
I'm getting a list of items in the format parent.id_child.id, like 1_2. I tried to group the child ids by parent id. For example, from the input ['1_2', '2_2', '1_1', '2_1', '1_3'] I need the output [['1','2','3'], ['1','2']]. I have tried this:
import itertools

inputlist = ['1_2', '1_1', '2_1', '1_3', '2_2']
outputlist = [item.split('_') for item in inputlist]
outputlist.sort()
final = [list(group) for key, group in itertools.groupby(outputlist, lambda x: x[0])]
This groups the elements correctly, but I need to obtain only the second element of each item. How can I achieve this? Also, can I do the whole thing in a single list comprehension?
Yes, use a list comprehension. Each group iterator yields the original values unchanged, so you need to select the second element again:
final = [[g[1] for g in group] for key, group in itertools.groupby(outputlist, lambda x: x[0])]
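For reference, here is a self-contained sketch of this approach using the question's sample input. The function name group_children is hypothetical and only there for illustration. Note that the sort is lexicographic on strings, so an id like '10' would sort before '2'.

```python
import itertools

def group_children(items):
    """Group child ids by parent id for strings shaped like 'parent_child'."""
    # Split each item into a [parent, child] pair and sort so that equal
    # parents end up adjacent; groupby only merges consecutive equal keys.
    pairs = sorted(item.split('_') for item in items)
    return [
        [pair[1] for pair in group]  # keep only the child id
        for _, group in itertools.groupby(pairs, key=lambda pair: pair[0])
    ]

inputlist = ['1_2', '1_1', '2_1', '1_3', '2_2']
final = group_children(inputlist)
print(final)  # [['1', '2', '3'], ['1', '2']]
```

The sort step matters: without it, groupby would start a new group every time the parent id changes between neighbouring items.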
You can do the whole thing in a single expression by nesting the splitting into the groupby call, but this becomes rather ugly fast, even when split across multiple lines:
final = [
    [g[1] for g in group]
    for key, group in itertools.groupby(
        sorted(item.split('_') for item in inputlist),
        lambda x: x[0])]
You could avoid sorting the whole input list and only sort the smaller groups by using a dictionary to do the grouping. Depending on the size of your ids, you may want to sort your ids numerically as well (text sorting is done lexicographically, so '10' would sort before '2'):
per_parent = {}
for item in inputlist:
    parent, child = item.split('_', 1)
    per_parent.setdefault(parent, []).append(child)
final = [children for parent, children in sorted(
    per_parent.items(), key=lambda pc: int(pc[0]))]
In Python 2, use iteritems() rather than items() to avoid building an intermediate list.
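The dictionary-based grouping above orders the parents numerically but leaves each list of children in input order. A minimal sketch that also sorts each smaller group numerically could look like the following. It assumes every id is a decimal integer string, and the function name group_children_numeric is hypothetical.

```python
def group_children_numeric(items):
    """Group child ids by parent id, ordering parents and children numerically.

    Assumes every id parses with int(); other id formats would need a
    different sort key.
    """
    per_parent = {}
    for item in items:
        # Split only on the first underscore: 'parent_child'.
        parent, child = item.split('_', 1)
        per_parent.setdefault(parent, []).append(child)
    return [
        sorted(children, key=int)  # sort each small group on its own
        for _, children in sorted(per_parent.items(), key=lambda pc: int(pc[0]))
    ]

final = group_children_numeric(['1_2', '2_2', '1_10', '2_1', '1_3', '10_1'])
print(final)  # [['2', '3', '10'], ['1', '2'], ['1']]
```

With key=int, child '10' lands after '3' and parent '10' after parent '2', which a plain string sort would not give.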