Filter a list of tuples based on a condition

Question

Given a list of tuples, if several tuples in the list share the same first element, keep only the one among them with the largest last element.

For example:

sample_list = [(5,16,2),(5,10,3),(5,8,1),(21,24,1)]

In sample_list above, the first three tuples share the same first element, 5, so only the second of them should be retained, because it has the largest last element (3).

Expected output:

op = [(5,10,3),(21,24,1)]

Code:

op = []
for m in range(len(sample_list)):
    li = [sample_list[m]]
    for n in range(len(sample_list)):
        if(sample_list[m][0] == sample_list[n][0]
           and sample_list[m][2] != sample_list[n][2]):
            li.append(sample_list[n])
    op.append(sorted(li,key=lambda dd:dd[2],reverse=True)[0])

print (list(set(op)))

This works, but it is very slow for long lists. Is there a more Pythonic or efficient way to do this?

Answer 1

You could use a collections.defaultdict to group tuples that have the same first element and then take the maximum of each group based on the third element:

from collections import defaultdict

sample_list = [(5,16,2),(5,10,3),(5,8,1),(21,24,1)]

d = defaultdict(list)
for e in sample_list:
    d[e[0]].append(e)

res = [max(val, key=lambda x: x[2]) for val in d.values()]
print(res)

Output

[(5, 10, 3), (21, 24, 1)]

This approach is O(n).
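
To make the O(n) claim concrete: each tuple is appended to its group exactly once, and each group is then scanned once by max. Below is a minimal sketch that wraps the same idea in a function; the name keep_max_last is hypothetical and not part of the answer, and the output order assumes Python 3.7+ dict ordering.

```python
from collections import defaultdict

def keep_max_last(tuples):
    """Group tuples by first element; keep the one with the largest last element."""
    groups = defaultdict(list)
    for t in tuples:
        groups[t[0]].append(t)  # one append per tuple
    # One pass over each group; the total work is proportional to len(tuples).
    return [max(g, key=lambda t: t[-1]) for g in groups.values()]

sample_list = [(5, 16, 2), (5, 10, 3), (5, 8, 1), (21, 24, 1)]
result = keep_max_last(sample_list)
print(result)  # [(5, 10, 3), (21, 24, 1)]
```

An empty input simply yields an empty list, since no groups are created.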

Answer 2

Use itertools.groupby and operator.itemgetter for readability. Because groupby only groups adjacent items, the list is sorted first. Within the groups, apply max with an appropriate key function, again using itemgetter for brevity:

from itertools import groupby
from operator import itemgetter as ig

lst = [(5, 10, 3), (21, 24, 1), (5, 8, 1), (5, 16, 2)]

[max(g, key=ig(-1)) for _, g in groupby(sorted(lst), key=ig(0))]
# [(5, 10, 3), (21, 24, 1)]

For a linear-time solution whose extra space is bounded only by the number of unique first elements, you may use a dict:

d = {}
for tpl in lst:
    first, *_, last = tpl
    if first not in d or last > d[first][-1]:
        d[first] = tpl

[*d.values()]
# [(5, 10, 3), (21, 24, 1)]
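
The groupby version above depends on the sort, because groupby merges only consecutive items with equal keys. The following sketch shows what happens when the sort is skipped; max_per_run is a hypothetical helper name introduced here for illustration.

```python
from itertools import groupby
from operator import itemgetter

lst = [(5, 10, 3), (21, 24, 1), (5, 8, 1), (5, 16, 2)]

def max_per_run(tuples):
    # groupby starts a new group every time the key changes,
    # so equal keys that are not adjacent end up in separate groups.
    return [max(g, key=itemgetter(-1))
            for _, g in groupby(tuples, key=itemgetter(0))]

unsorted_result = max_per_run(lst)
sorted_result = max_per_run(sorted(lst))

print(unsorted_result)  # [(5, 10, 3), (21, 24, 1), (5, 16, 2)]
print(sorted_result)    # [(5, 10, 3), (21, 24, 1)]
```

Without sorting, the key 5 appears in two separate runs and produces two results. The sort is also why the groupby approach is O(n log n) rather than linear.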

Answer 3

Try itertools.groupby:

from itertools import groupby
sample_list.sort()
print([max(l, key=lambda x: x[-1]) for _, l in groupby(sample_list, key=lambda x: x[0])])

Or with operator.itemgetter:

from itertools import groupby
from operator import itemgetter
sample_list.sort()
print([max(l, key=itemgetter(-1)) for _, l in groupby(sample_list, key=itemgetter(0))])

For better performance, try:

from operator import itemgetter
dct = {}
for i in sample_list:
    if i[0] in dct:
        dct[i[0]].append(i)
    else:
        dct[i[0]] = [i]
print([max(v, key=itemgetter(-1)) for v in dct.values()])

Output of all of the above:

[(5, 10, 3), (21, 24, 1)]

Answer 4

Here is a linear-time method which I think qualifies as more Pythonic:

highest = dict()
for a, b, c in sample_list:
    if a not in highest or c >= highest[a][2]:
        highest[a] = (a, b, c)
op = list(highest.values())

With >=, a later triple replaces an earlier one when their last elements are equal; change >= to > to keep the earlier triple instead. This only matters when triples share the same first and last elements but differ in the middle element.
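
The effect of the comparison operator on ties can be seen with a small sketch. The function keep_highest and its prefer_later flag are hypothetical names used only for this illustration; prefer_later=True corresponds to >= and prefer_later=False to >.

```python
def keep_highest(triples, prefer_later=True):
    """Keep, per first element, the triple with the largest last element."""
    highest = {}
    for a, b, c in triples:
        if a not in highest:
            highest[a] = (a, b, c)
        elif c > highest[a][2] or (prefer_later and c == highest[a][2]):
            # A strictly larger last element always wins;
            # an equal one wins only when prefer_later is set.
            highest[a] = (a, b, c)
    return list(highest.values())

# Two triples tie on both the first element (5) and the last element (3).
tied = [(5, 16, 3), (5, 10, 3), (21, 24, 1)]

later = keep_highest(tied, prefer_later=True)
earlier = keep_highest(tied, prefer_later=False)
print(later)    # [(5, 10, 3), (21, 24, 1)]
print(earlier)  # [(5, 16, 3), (21, 24, 1)]
```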

As pointed out by @AlexWaygood, dicts have yielded their elements in insertion order since Python 3.7. The code above therefore yields the elements of op in the order in which each first element first appears in sample_list.
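
As a small illustration of that ordering, assuming Python 3.7 or newer: overwriting the value of an existing key does not move the key, so a winning tuple that appears late in the input still lands at the position where its first element was first seen. The list named data is made up for this example.

```python
# The best tuple for key 5 comes after the tuple for key 21 in the input.
data = [(5, 8, 1), (21, 24, 1), (5, 10, 3)]

highest = {}
for a, b, c in data:
    if a not in highest or c >= highest[a][2]:
        highest[a] = (a, b, c)  # updating a key keeps its original position

op = list(highest.values())
print(op)  # [(5, 10, 3), (21, 24, 1)]
```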

In Python 3.6 or older, on the other hand, the order may change. If you want a solution that works in Python 3.6 too, you will need to use an OrderedDict, as in:

from collections import OrderedDict

highest = OrderedDict()
for a, b, c in sample_list:
    if a not in highest or c >= highest[a][2]:
        highest[a] = (a, b, c)
op = list(highest.values())

4 answers

Answer 1
23 accepted 2021-09-02 06:31:11

Answer 2
16 2021-09-02 06:32:13

Answer 3
5 2021-09-02 06:29:26

Answer 4
1 2021-09-09 10:04:02