Group a list of tuples by certain indices
I have the list A below:
A = [('a',(1,2,3)),
('b',(2,4,5)),
('c',(2,3,2)),
('d',(5,3,2))]
I would like to group A based on the second and third elements of the inner tuple. So, the desired output is:
output = [[('a',(1,2,3))],
[('b',(2,4,5))],
[('c',(2,3,2)), ('d',(5,3,2))]]
I could achieve half of this by creating B out of A as follows and using itemgetter and groupby. But this requires a remapping to include the first element of each outer tuple. I thought there could be a more efficient way.
from operator import itemgetter
from itertools import groupby
B = [i[1] for i in A]
semi_output = [list(g) for _,g in
groupby(B,itemgetter(1,2))]
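One way to avoid the remapping, sketched here rather than taken from the original post, is to run groupby on A itself with a key function that looks inside the inner tuple. The helper name inner_key is hypothetical, and the sketch assumes that items with equal keys are already adjacent in A, which holds for the sample data.

```python
from itertools import groupby

A = [('a', (1, 2, 3)),
     ('b', (2, 4, 5)),
     ('c', (2, 3, 2)),
     ('d', (5, 3, 2))]


def inner_key(item):
    # Hypothetical helper: the second and third elements of the inner tuple.
    return item[1][1:]


# groupby only merges consecutive items with equal keys, so this relies on
# equal keys already being next to each other in A.
grouped = [list(g) for _, g in groupby(A, key=inner_key)]
print(grouped)
```

Because the key is computed from the full outer tuple, each group keeps the first element ('a', 'b', ...) and no second pass is needed.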
I'm not sure about efficiency, but the code below will solve your problem.
from collections import Counter
A = [('a', (1, 2, 3)),
('b', (2, 4, 5)),
('x', (4, 4, 5)),
('c', (2, 3, 2)),
('d', (5, 3, 2))]
test = []
for tup in A:
    test.append(tup[1][1:])  # last two elements of the inner tuple
duplicates = [k for k, v in Counter(test).items() if v > 1]  # keys that occur more than once
group_data = []
non_group_data = []
for dup in duplicates:  # compare each duplicated key with the original elements
    match = []
    non_match = []
    for tup in A:
        if tup[1][1:] == dup:
            match.append(tup)
        else:
            non_match.append(tup)
    non_group_data.extend(non_match)
    group_data.append(match)
# items that failed to match more than one duplicated key belong to no group
duplicates = [[k for k, v in Counter(non_group_data).items() if v > 1]]
output = group_data + duplicates
print(output)
>> [[('b', (2, 4, 5)), ('x', (4, 4, 5))], [('c', (2, 3, 2)), ('d', (5, 3, 2))], [('a', (1, 2, 3))]]
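As an alternative to counting duplicates, a dictionary keyed on the last two elements of the inner tuple can collect the groups in a single pass. This is only a sketch of a different technique, not the approach used above. It gives each unmatched item its own group and relies on dictionaries preserving insertion order (Python 3.7+).

```python
from collections import defaultdict

A = [('a', (1, 2, 3)),
     ('b', (2, 4, 5)),
     ('x', (4, 4, 5)),
     ('c', (2, 3, 2)),
     ('d', (5, 3, 2))]

groups = defaultdict(list)
for item in A:
    # Key on the second and third elements of the inner tuple.
    groups[item[1][1:]].append(item)

# Groups appear in the order their key was first seen in A.
output = list(groups.values())
print(output)
```

This does not require the input to be sorted, and equal keys do not need to be adjacent.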
If you first sort your list by the second and third elements of the inner tuple, groupby will work as expected.
from itertools import groupby
from operator import itemgetter
a = [('a',(1,2,3)),
('b',(2,4,5)),
('c',(2,3,2)),
('d',(5,3,2))]
b = sorted(a, key=lambda x: (x[1][1], x[1][2]))
# [('a', (1, 2, 3)),
# ('c', (2, 3, 2)),
# ('d', (5, 3, 2)),
# ('b', (2, 4, 5))]
c = groupby(b, key=lambda x: (x[1][1], x[1][2]))
d = [list(x[1]) for x in c]
# [[('a', (1, 2, 3))],
# [('c', (2, 3, 2)), ('d', (5, 3, 2))],
# [('b', (2, 4, 5))]]
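The sort matters because groupby only merges consecutive items that share a key. The small sketch below uses made-up data (the 'x' entry and the helper name inner_key are assumptions for illustration) in which two items with the same key are separated by another item.

```python
from itertools import groupby


def inner_key(item):
    # Hypothetical helper: the second and third elements of the inner tuple.
    return item[1][1:]


data = [('b', (2, 4, 5)),
        ('c', (2, 3, 2)),
        ('x', (4, 4, 5))]  # same key as 'b', but not adjacent to it

# Without sorting, 'b' and 'x' end up in separate groups.
unsorted_groups = [list(g) for _, g in groupby(data, key=inner_key)]

# Sorting by the same key first makes equal keys adjacent.
sorted_groups = [list(g)
                 for _, g in groupby(sorted(data, key=inner_key), key=inner_key)]

print(len(unsorted_groups), len(sorted_groups))
```

Since sorted is stable, items within a group keep their original relative order.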
If you further want the results sorted by the first element in the tuple, that's trivial.
e = sorted(d, key=itemgetter(0))
# [[('a', (1, 2, 3))],
# [('b', (2, 4, 5))],
# [('c', (2, 3, 2)), ('d', (5, 3, 2))]]