How to remove duplicates in a list of lists, keeping the element with the highest value based on the second element in the list? Python

Question

I have a list of lists in which the first entry of some sub-lists repeats. I would like to remove the duplicates and keep only the item with the highest score (the second entry of each sub-list).

list_dup = [["Apple", 24],
    ["Apple", 23],
    ["Sun", 15],
    ["Apple", 2],
    ["Sun", 1],
    ["Blue", 15]
    ]

Desired output:

list_dup = [["Apple", 24], 
    ["Sun", 15], 
    ["Blue", 15]
    ]

Answer 1

import pandas as pd
pd.DataFrame(list_dup).groupby(0).max().reset_index().values.tolist()

Step by step:

convert the list into a pd.DataFrame;
group the rows by the value of the first column (column 0, the one containing the strings);
take the max of the other columns within each group (in your case there is just one other column, the one containing the integers);
move the group key back into a regular column with reset_index(), then convert the resulting pd.DataFrame to a NumPy array (with .values) and that array to a list (with .tolist()).
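The steps above can be sketched one at a time, as below. This is only an illustrative breakdown of the one-liner, and the intermediate names (df, grouped, maxima, result) are made up for the example. The sort=False argument is an addition that is not in the answer: by default groupby sorts the group keys, so the one-liner returns the groups in alphabetical order, whereas sort=False keeps them in order of first appearance, which matches the output asked for in the question.

```python
import pandas as pd

list_dup = [["Apple", 24], ["Apple", 23], ["Sun", 15],
            ["Apple", 2], ["Sun", 1], ["Blue", 15]]

# Step 1: build a DataFrame; the columns are labelled 0 and 1 automatically.
df = pd.DataFrame(list_dup)

# Step 2: group the rows by column 0 (the strings).
# sort=False (an addition to the original answer) keeps the groups in
# order of first appearance instead of sorting the keys.
grouped = df.groupby(0, sort=False)

# Step 3: take the maximum of the remaining column within each group.
maxima = grouped.max()

# Step 4: turn the group key back into a column, then convert to nested lists.
result = maxima.reset_index().values.tolist()
print(result)
```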

Answer 2

If the order of the output list is not important, you can use sorted to sort the list so that sub-lists with the same first element end up next to each other, then use itertools.groupby to pull together groups based on the first element, and finally use max to pick the sub-list with the highest second element in each group.

from itertools import groupby

[max(g, key=lambda x: x[1]) for _, g in groupby(sorted(list_dup), key=lambda x: x[0])]
# returns:
[['Apple', 24], ['Blue', 15], ['Sun', 15]]
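A possible way to see why the sort is needed: itertools.groupby only merges consecutive items that share a key, so on unsorted input the same name can show up in several groups. The sketch below spells the one-liner out as a loop and compares it with the unsorted case. The variable names are illustrative, and operator.itemgetter is used here in place of the lambdas purely as a stylistic choice.

```python
from itertools import groupby
from operator import itemgetter

list_dup = [["Apple", 24], ["Apple", 23], ["Sun", 15],
            ["Apple", 2], ["Sun", 1], ["Blue", 15]]

# Sort by name so that equal names become adjacent.
by_name = sorted(list_dup, key=itemgetter(0))

result = []
for name, group in groupby(by_name, key=itemgetter(0)):
    # 'group' is a one-shot iterator over the sub-lists sharing this name.
    result.append(max(group, key=itemgetter(1)))

# Without sorting, "Apple" and "Sun" are split into separate runs,
# so duplicates survive.
unsorted_result = [max(g, key=itemgetter(1))
                   for _, g in groupby(list_dup, key=itemgetter(0))]

print(result)           # [['Apple', 24], ['Blue', 15], ['Sun', 15]]
print(unsorted_result)  # [['Apple', 24], ['Sun', 15], ['Apple', 2], ['Sun', 1], ['Blue', 15]]
```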

Answer 3

There are many possibilities. One of the clearest may be a plain dictionary that tracks the highest value seen so far for each first element:

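m_d = {}  # maps each name to the highest value seen so far
for k in list_dup:
    if k[0] in m_d:
        # keep the larger of the stored value and the current one
        if m_d[k[0]] < k[1]:
            m_d[k[0]] = k[1]
    else:
        # first time this name appears
        m_d[k[0]] = k[1]

# turn the dictionary back into a list of [name, value] pairs
list_no_dup = [[k, v] for k, v in m_d.items()]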
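The same idea could also be wrapped in a small reusable function. The sketch below is one way to do it. The name keep_highest is hypothetical, and the order of the result relies on dictionaries preserving insertion order, which is guaranteed from Python 3.7 on. Under that assumption the result comes out in order of first appearance, matching the output requested in the question.

```python
def keep_highest(pairs):
    """Return one [key, value] pair per key, keeping the largest value."""
    best = {}
    for key, value in pairs:
        # Store the value the first time a key is seen,
        # or whenever it beats the current best for that key.
        if key not in best or value > best[key]:
            best[key] = value
    # Dicts keep insertion order (Python 3.7+), so keys come out
    # in the order they first appeared in the input.
    return [[key, value] for key, value in best.items()]


list_dup = [["Apple", 24], ["Apple", 23], ["Sun", 15],
            ["Apple", 2], ["Sun", 1], ["Blue", 15]]

print(keep_highest(list_dup))  # [['Apple', 24], ['Sun', 15], ['Blue', 15]]
```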

3 answers

solution1
3 ACCEPTED 2022-12-07 13:55:39

solution2
3 2022-12-07 13:59:13

solution3
1 2022-12-07 14:01:43
