
Group a list of tuples by certain indices

I have the list A below:

A = [('a',(1,2,3)),
     ('b',(2,4,5)),
     ('c',(2,3,2)),
     ('d',(5,3,2))]

I would like to group A by the second and third elements of each inner tuple. The desired output is:

output = [[('a',(1,2,3))],
          [('b',(2,4,5))],
          [('c',(2,3,2)), ('d',(5,3,2))]]

I can get halfway there by building B from A as follows and using itemgetter and groupby. However, this drops the first element of each outer tuple, so the result would have to be mapped back onto A. I suspect there is a more efficient way.

from operator import itemgetter
from itertools import groupby

B = [i[1] for i in A]

semi_output = [list(g) for _,g in 
                groupby(B,itemgetter(1,2))]

I'm not sure about efficiency, but the code below will solve your problem.

from collections import Counter

A = [('a', (1, 2, 3)),
     ('b', (2, 4, 5)),
     ('x', (4, 4, 5)),
     ('c', (2, 3, 2)),
     ('d', (5, 3, 2))]
test = []
for tup in A:
    test.append(tup[1][1:])  # last two elements of the inner tuple
counts = Counter(test)
duplicates = [k for k, v in counts.items() if v > 1]  # keys shared by several tuples
group_data = []
for dup in duplicates:  # collect the tuples that share each duplicated key
    match = []
    for tup in A:
        if tup[1][1:] == dup:
            match.append(tup)
    group_data.append(match)
# tuples whose key occurs only once each form their own group
singles = [[tup] for tup in A if counts[tup[1][1:]] == 1]
output = group_data + singles
print(output)

>> [[('b', (2, 4, 5)), ('x', (4, 4, 5))], [('c', (2, 3, 2)), ('d', (5, 3, 2))], [('a', (1, 2, 3))]]
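
For comparison, a dictionary keyed on the last two elements of the inner tuple can do the same grouping in a single pass over the list. This is only a sketch, assuming the groups should appear in order of first occurrence; the names groups and key are illustrative and do not come from the posts above.

```python
from collections import defaultdict

A = [('a', (1, 2, 3)),
     ('b', (2, 4, 5)),
     ('c', (2, 3, 2)),
     ('d', (5, 3, 2))]

# Map each (second, third) pair to the list of outer tuples that carry it.
groups = defaultdict(list)
for item in A:
    key = item[1][1:]  # second and third elements of the inner tuple
    groups[key].append(item)

# Dicts keep insertion order (Python 3.7+), so groups come out
# in the order their key was first seen.
output = list(groups.values())
print(output)
```

No sorting is needed here, so the original order of A is preserved within and across groups.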

If you first sort your list by the second and third elements of the inner tuple, groupby will work as expected.

from itertools import groupby
from operator import itemgetter

a = [('a',(1,2,3)),
     ('b',(2,4,5)),
     ('c',(2,3,2)),
     ('d',(5,3,2))]

b = sorted(a, key=lambda x: (x[1][1], x[1][2]))
# [('a', (1, 2, 3)), 
#  ('c', (2, 3, 2)), 
#  ('d', (5, 3, 2)), 
#  ('b', (2, 4, 5))]

c = groupby(b, key=lambda x: (x[1][1], x[1][2]))

d = [list(x[1]) for x in c]
# [[('a', (1, 2, 3))], 
#  [('c', (2, 3, 2)), ('d', (5, 3, 2))], 
#  [('b', (2, 4, 5))]]
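
The sort matters because groupby only merges items that sit next to each other. The small comparison below illustrates this; the list data and the helper key are made up for the illustration and are not part of the question.

```python
from itertools import groupby

def key(item):
    # second and third elements of the inner tuple
    return item[1][1:]

# 'b' and 'x' share the key (4, 5) but are not adjacent.
data = [('b', (2, 4, 5)),
        ('c', (2, 3, 2)),
        ('x', (4, 4, 5))]

# Without sorting, the two (4, 5) items end up in separate groups.
unsorted_groups = [list(g) for _, g in groupby(data, key=key)]

# After sorting by the same key, equal keys are adjacent and merge.
sorted_groups = [list(g) for _, g in groupby(sorted(data, key=key), key=key)]

print(len(unsorted_groups))  # 3
print(len(sorted_groups))    # 2
```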

If you further want the groups ordered by the first element of their first tuple, one more sort does it.

e = sorted(d, key=itemgetter(0))
# [[('a', (1, 2, 3))], 
#  [('b', (2, 4, 5))], 
#  [('c', (2, 3, 2)), ('d', (5, 3, 2))]]
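
If the indices to group on vary, the sort-then-groupby steps above could be wrapped in a small helper. This is a rough sketch rather than a tested utility: the function name group_by_indices and its parameters are hypothetical, and it assumes every item is a (label, values) pair.

```python
from itertools import groupby
from operator import itemgetter

def group_by_indices(items, indices):
    """Group (label, values) pairs by the entries of values at the given indices."""
    pick = itemgetter(*indices)  # pulls the chosen positions out of a tuple

    def key(item):
        return pick(item[1])

    # groupby only merges adjacent items, so sort by the same key first.
    ordered = sorted(items, key=key)
    return [list(g) for _, g in groupby(ordered, key=key)]

a = [('a', (1, 2, 3)),
     ('b', (2, 4, 5)),
     ('c', (2, 3, 2)),
     ('d', (5, 3, 2))]

result = group_by_indices(a, (1, 2))
print(result)
```

Groups come back ordered by key, as in d above; apply the final sorted(..., key=itemgetter(0)) step if you want them ordered by label.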

