
How to sort a list of lists and keep only the maximal 2nd element for each 1st element?

Let's say that I have some list:

lst = [[2,6],[1,4],[0,1],[1,1],[2,3],[0,2]]

I want to sort lst by the first element and, for each distinct first element, keep only the sublist with the maximal second element.

So the results will be:

results
>>> [[0,2],[1,4],[2,6]]

Can someone kindly help me?

You can do it using np.maximum.reduceat:

import numpy as np
lst = np.array([[2,6],[1,4],[0,1],[1,1],[2,3],[0,2]])
lst = lst[np.argsort(lst[:,0])]  # sort lst by its first column
u, idx = np.unique(lst[:,0], return_index = True) 
print(np.c_[u, np.maximum.reduceat(lst[:,1], idx)])

First, the array must be sorted. Then you need the indices that split the array into groups, idx = [0, 2, 4], and the corresponding values of the first column, u = [0, 1, 2]. Finally, use np.maximum.reduceat to get the maximum of each group starting at the indices in idx, and display it concatenated column-wise to the right of u.

Remark: I used numpy here, a widely used library that pushes the looping down to C level, which is much faster. Pure Python solutions are worth attention too.
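As an illustration of the pure Python route mentioned in the remark, here is a minimal sketch that tracks the running maximum per first element in a dict. The function name max_second_by_first is made up for this example and does not come from any answer; it assumes every sublist holds exactly two comparable values.

```python
def max_second_by_first(pairs):
    """Return [first, max_second] pairs, ordered by first element."""
    best = {}
    for first, second in pairs:
        # Keep the largest second element seen so far for this key.
        if first not in best or second > best[first]:
            best[first] = second
    # Only the distinct keys are sorted here, not the whole input.
    return [[key, best[key]] for key in sorted(best)]


lst = [[2, 6], [1, 4], [0, 1], [1, 1], [2, 3], [0, 2]]
result = max_second_by_first(lst)
print(result)  # [[0, 2], [1, 4], [2, 6]]
```

This makes a single pass over the input and does not depend on how many times each first element occurs.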

Bonus: This is actually a one-liner using numpy_indexed, a library (not so widely used) dedicated to group-by operations on arrays:

import numpy_indexed as npi
import numpy as np
np.transpose(npi.group_by(lst[:, 0]).max(lst[:, 1]))

Assuming you just have 'pairs' like this (i.e. every 1st value always appears in exactly two sublists), it's very simple:

>>> lst = [[2,6],[1,4],[0,1],[1,1],[2,3],[0,2]]
>>> sorted(lst)[1::2]
[[0, 2], [1, 4], [2, 6]]

Sorting the list by default sorts on the 1st and then the 2nd value of each sublist; then just slice the resulting list to take every other item.
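To show why the "exactly two sublists per first value" assumption matters, here is a small sketch comparing the slicing shortcut on the question's data and on a made-up list (unbalanced) in which one key appears three times:

```python
balanced = [[2, 6], [1, 4], [0, 1], [1, 1], [2, 3], [0, 2]]
unbalanced = [[0, 1], [0, 2], [0, 3], [1, 4]]  # key 0 appears three times

ok = sorted(balanced)[1::2]
wrong = sorted(unbalanced)[1::2]
print(ok)     # [[0, 2], [1, 4], [2, 6]]
print(wrong)  # [[0, 2], [1, 4]]  (the maximum for key 0 is 3, not 2)
```

So the slice is only safe when the group sizes are known in advance.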

Sort the list, group the elements by first item, and then keep the max by second item in each group:

import itertools as it
from operator import itemgetter

lst = [[2,6],[1,4],[0,1],[1,1],[2,3],[0,2]]

slst = sorted(lst, key=itemgetter(0))
gs = it.groupby(slst, key=itemgetter(0))
res = [max(v, key=itemgetter(1)) for k,v in gs]
print(res)

produces

[[0, 2], [1, 4], [2, 6]]
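The sorting step matters because itertools.groupby only merges consecutive items with equal keys. A short sketch comparing the group keys with and without sorting (the variable names here are just for illustration):

```python
from itertools import groupby
from operator import itemgetter

lst = [[2, 6], [1, 4], [0, 1], [1, 1], [2, 3], [0, 2]]

# Without sorting, every change of key starts a new group.
unsorted_keys = [k for k, _ in groupby(lst, key=itemgetter(0))]

# After sorting by the first item, each key forms exactly one group.
slst = sorted(lst, key=itemgetter(0))
sorted_keys = [k for k, _ in groupby(slst, key=itemgetter(0))]

print(unsorted_keys)  # [2, 1, 0, 1, 2, 0]
print(sorted_keys)    # [0, 1, 2]
```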

Try something like the code segment below, which doesn't require any imports.

lst = [[2,6],[1,4],[0,1],[1,1],[2,3],[0,2]]

lst = sorted(lst) # Sort the list in increasing order.
lst = [lst[i] for i in range(len(lst)) if i+1 == len(lst) or lst[i+1][0] != lst[i][0]]
# Keep only the last sublist of each group, i.e. the one with the maximal 2nd element.

print(lst)

Output:

[[0, 2], [1, 4], [2, 6]]

Another way, using a dict.

>>> [*dict(sorted(lst)).items()]
[(0, 2), (1, 4), (2, 6)]

It produces the pairs as tuples instead of lists, but you even accepted an answer that produces a numpy array. To get lists:

>>> [*map(list, dict(sorted(lst)).items())]
[[0, 2], [1, 4], [2, 6]]

These solutions work because the dict keeps the last value for each key, so if we sort first, then the last is the largest.
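A small sketch of that reasoning, assuming the same lst as in the question. It also shows that the sort has to order by the second element as well: sorting on the first element alone is stable and leaves each group in input order, so the last value is not necessarily the largest.

```python
from operator import itemgetter

lst = [[2, 6], [1, 4], [0, 1], [1, 1], [2, 3], [0, 2]]

# Full sort: within each key the largest 2nd value comes last and wins.
full = dict(sorted(lst))

# Sorting by the 1st element only keeps the input order inside each group.
by_key_only = dict(sorted(lst, key=itemgetter(0)))

print(full)         # {0: 2, 1: 4, 2: 6}
print(by_key_only)  # {0: 2, 1: 1, 2: 3}
```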

