Remove duplicates from a list of tuples by highest value

Question

Suppose I have a list of tuples like this:

[('Machine1', 88), ('Machine2', 90), ('Machine3', 78), ('Machine1', 90), ('Machine3', 95)]

I want to filter the list so that only the highest value for each name remains. In this example the filtered list would be:

[('Machine2', 90), ('Machine1', 90), ('Machine3', 95)]

I basically want to remove duplicates, keeping the highest value. I know set only removes exact duplicates, so it won't work here. Another approach I considered is to build a dictionary while iterating through the list, updating an entry whenever a higher value is seen. Is there a more Pythonic way to approach this?

Answer 1

One solution with a plain dict:

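l = [('Machine1', 88), ('Machine2', 90), ('Machine3', 78),
     ('Machine1', 90), ('Machine3', 95)]
d = {}
for machine, value in l:
    # Keep the larger of the stored value and the new one; -inf stands in for unseen names.
    d[machine] = max(d.get(machine, -float('inf')), value)
print(list(d.items()))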

Output

[('Machine1', 90), ('Machine2', 90), ('Machine3', 95)]
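
Note that this output is ordered by the first appearance of each name, while the expected list in the question follows the position of each winning tuple. If that ordering matters, the following is a minimal sketch of one way to reproduce it. The function name `max_per_key_in_winner_order` and the variable `machines` are hypothetical and do not come from the original answers.

```python
def max_per_key_in_winner_order(pairs):
    """Keep the highest value per name, ordered by where the winning tuple sits in the input."""
    best = {}  # name -> (index of the winning tuple, its value)
    for index, (name, value) in enumerate(pairs):
        # Strict '>' keeps the earliest tuple when two values tie.
        if name not in best or value > best[name][1]:
            best[name] = (index, value)
    # Sort by the index of each winning tuple to restore input order.
    ordered = sorted(best.items(), key=lambda item: item[1][0])
    return [(name, value) for name, (_, value) in ordered]


machines = [('Machine1', 88), ('Machine2', 90), ('Machine3', 78),
            ('Machine1', 90), ('Machine3', 95)]
print(max_per_key_in_winner_order(machines))
# [('Machine2', 90), ('Machine1', 90), ('Machine3', 95)]
```

The extra index costs one sort over the unique names. If output order is irrelevant, the plain dict loop is simpler.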

Using pandas (for fun):

>>> import pandas as pd
>>> list(pd.DataFrame(l).groupby(0).max().to_dict()[1].items())
[('Machine1', 90), ('Machine2', 90), ('Machine3', 95)]

Answer 2

Here's one solution using collections.defaultdict. The idea is to iterate over your list of tuples and append each value to a list keyed by name. Then use zip with map + max to create the desired result.

from collections import defaultdict

L = [('Machine1', 88), ('Machine2', 90), ('Machine3', 78),
     ('Machine1', 90), ('Machine3', 95)]

d = defaultdict(list)

for name, num in L:
    d[name].append(num)

res = list(zip(d, map(max, d.values())))
print(res)

Result

[('Machine1', 90), ('Machine2', 90), ('Machine3', 95)]
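
To make the two steps of this answer easier to follow, here is a small sketch that prints the intermediate grouping and then builds the same result with a list comprehension in place of zip and map. The variable names `grouped` and `res_alt` are illustrative only.

```python
from collections import defaultdict

L = [('Machine1', 88), ('Machine2', 90), ('Machine3', 78),
     ('Machine1', 90), ('Machine3', 95)]

# Step 1: collect every value seen for each name.
grouped = defaultdict(list)
for name, num in L:
    grouped[name].append(num)
print(dict(grouped))
# {'Machine1': [88, 90], 'Machine2': [90], 'Machine3': [78, 95]}

# Step 2: reduce each list to its maximum.
res_alt = [(name, max(nums)) for name, nums in grouped.items()]
print(res_alt)
# [('Machine1', 90), ('Machine2', 90), ('Machine3', 95)]
```

This approach stores every value before reducing, which is handy if you later need another aggregate such as the minimum or the mean. It uses more memory than keeping a running maximum.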

Answer 3

You can also use groupby from itertools, which requires the input to be sorted by the grouping key:

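>>> import itertools as it
>>> data = [('Machine1', 88), ('Machine2', 90), ('Machine3', 78), ('Machine1', 90), ('Machine3', 95)]
>>> [(k, max(list(zip(*g))[1])) for k, g in it.groupby(sorted(data), key=lambda m: m[0])]
[('Machine1', 90), ('Machine2', 90), ('Machine3', 95)]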

Because sorted(data) also orders tuples with the same name by value, the last element of each group is its maximum, so you could also do:

>>> [(k, list(zip(*g))[1][-1]) for k, g in it.groupby(sorted(data), key=lambda m: m[0])]
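
The shortcut relies on how Python compares tuples: element by element, so the name is compared first and the value second. The sketch below prints the sorted list to make that visible, then takes the last tuple of each group directly. It uses `operator.itemgetter` instead of the lambda purely as a stylistic choice, and the names `ordered` and `last_per_group` are illustrative.

```python
from itertools import groupby
from operator import itemgetter

data = [('Machine1', 88), ('Machine2', 90), ('Machine3', 78),
        ('Machine1', 90), ('Machine3', 95)]

# Tuples sort by name first, then by value, so equal names end up adjacent
# and in ascending value order.
ordered = sorted(data)
print(ordered)
# [('Machine1', 88), ('Machine1', 90), ('Machine2', 90), ('Machine3', 78), ('Machine3', 95)]

# The last tuple of each group therefore carries that name's maximum.
last_per_group = [list(group)[-1] for _, group in groupby(ordered, key=itemgetter(0))]
print(last_per_group)
# [('Machine1', 90), ('Machine2', 90), ('Machine3', 95)]
```

Sorting makes this O(n log n), whereas the dictionary-based answers run in a single O(n) pass. The groupby version is mainly attractive when the data is already sorted.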

3 answers

Answer 1: score 3, 2018-09-06 23:30:14
Answer 2: score 2, accepted, 2018-09-06 23:29:31
Answer 3: score 1, 2018-09-06 23:54:15
