How to sort a list of lists based on values of another list of lists?

My two lists of lists are:

lst_1 = [[1, 'John'], [2, 'Mcquin'], [4, 'Paul'], [7, 'Jimmy'], [9, 'Coco'], [11, 'Coco']]
lst_2 = [[3, 'Mcquin', 1], [6, 'Paul', 6], [5, 'John', 15], [12, 'Coco', 18], [8, 'Jimmy', 24], [10, 'Coco', 24]]

What is the most efficient way of sorting lst_1 so that its names (lst_1[i][1]) follow the order of the second value in each sublist of lst_2 (lst_2[i][1])? Preferred output:

[[2, 'Mcquin'], [4, 'Paul'], [1, 'John'], [9, 'Coco'], [7, 'Jimmy'], [11, 'Coco']]

It doesn't matter if there are duplicates of the same name (Coco in this case). Also, the two lists will always contain the same names, as they do here.

If both your lists contain the same names the same number of times, you could store the lst_2 indices of each name in a collections.defaultdict of deques, then pop off the next index and use it as the sort key each time that name is encountered during sorting.

Demo:

from collections import defaultdict, deque

lst_1 = [[1, 'John'], [2, 'Mcquin'], [4, 'Paul'], [7, 'Jimmy'], [9, 'Coco'], [11, 'Coco']]
lst_2 = [[3, 'Mcquin', 1], [6, 'Paul', 6], [5, 'John', 15], [12, 'Coco', 18], [8, 'Jimmy', 24], [10, 'Coco', 24]]

sort_map = defaultdict(deque)
for i, x in enumerate(lst_2):
    sort_map[x[1]].append(i)

result = sorted(lst_1, key=lambda x: sort_map[x[1]].popleft())

print(result)

Output:

[[2, 'Mcquin'], [4, 'Paul'], [1, 'John'], [9, 'Coco'], [7, 'Jimmy'], [11, 'Coco']]

Note: collections.deque lets you pop elements from the beginning in constant time, as shown above, whereas list.pop(0) takes O(N). This minor improvement allows the solution to remain at O(N log N) overall, which is the cost of sorting.
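
One thing worth noting about the demo above: the lambda mutates sort_map every time it runs, so the result depends on sorted calling the key function exactly once per element, in list order (which CPython does). A minimal sketch that makes this explicit by assigning every key up front could look like the following. The helper name order_like is hypothetical and not part of the original answer.

```python
from collections import defaultdict, deque

lst_1 = [[1, 'John'], [2, 'Mcquin'], [4, 'Paul'], [7, 'Jimmy'], [9, 'Coco'], [11, 'Coco']]
lst_2 = [[3, 'Mcquin', 1], [6, 'Paul', 6], [5, 'John', 15], [12, 'Coco', 18], [8, 'Jimmy', 24], [10, 'Coco', 24]]


def order_like(items, reference):
    """Return items reordered so their names follow the name order in reference.

    Assumes both lists contain the same names the same number of times.
    """
    # Queue up the positions at which each name occurs in the reference list.
    positions = defaultdict(deque)
    for index, row in enumerate(reference):
        positions[row[1]].append(index)

    # Assign each item its target position once, in list order.
    keys = [positions[item[1]].popleft() for item in items]

    # Sort on the precomputed positions only (they are unique).
    pairs = sorted(zip(keys, items), key=lambda pair: pair[0])
    return [item for _, item in pairs]


result = order_like(lst_1, lst_2)
print(result)
```

The behaviour is the same as the one-line sorted call; the only difference is that the key assignment no longer happens inside the sort.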

Edit: I think I have an O(n) solution!


Originally, I thought we could create a dictionary mapping each name to the index at which it should appear in the final list, based on lst_2. Then we could create the final list by sorting lst_1 on that index, giving an O(n log(n)) solution.

However, the problem with that method is that there are duplicate names in lst_2! The new method also has a better time complexity.


First we create a dictionary based on lst_1 where each key is a name and each value is a collections.deque (thanks RoadRunner) of the numbers which correspond to that name.

By using a deque, we maintain the ordering of those elements in lst_1 with the same names. We can also call .popleft() on a deque in O(1) time.

This then allows us to iterate over lst_2 (removing the need for any sorting, as it is already in order) and append to a new list the name followed by the first remaining number stored for that name in the dictionary we created.

If we use .popleft() to get the first element, we also remove it, meaning that when that name next comes up in lst_2, we get the next value from lst_1.

So, here's the code:

from collections import deque

# Map each name to a queue of its numbers, in lst_1 order.
vals = {}
for v, n in lst_1:
    vals.setdefault(n, deque()).append(v)

# vals == {'John': [1], 'Mcquin': [2], 'Paul': [4], 'Jimmy': [7], 'Coco': [9, 11]}
#         (each value is actually a deque, but it's easier to read as a list)

# Walk lst_2 in order, taking the next unused number for each name.
r = []
for _, n, _ in lst_2:
    r.append([n, vals[n].popleft()])

giving r (for result) as:

[['Mcquin', 2], ['Paul', 4], ['John', 1], ['Coco', 9], ['Jimmy', 7], ['Coco', 11]]
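
Note that r above holds [name, number] pairs, whereas the question's preferred output keeps the [number, name] layout. If that layout is wanted, a small variant of the same idea might look like this. It is only a sketch; the function name reorder_by_names is hypothetical, and it assumes every name in lst_2 occurs equally often in lst_1.

```python
from collections import deque

lst_1 = [[1, 'John'], [2, 'Mcquin'], [4, 'Paul'], [7, 'Jimmy'], [9, 'Coco'], [11, 'Coco']]
lst_2 = [[3, 'Mcquin', 1], [6, 'Paul', 6], [5, 'John', 15], [12, 'Coco', 18], [8, 'Jimmy', 24], [10, 'Coco', 24]]


def reorder_by_names(rows, reference):
    """Rebuild rows in the name order of reference, in a single pass over each list."""
    # name -> queue of numbers, preserving the order in which they appear in rows
    queues = {}
    for number, name in rows:
        queues.setdefault(name, deque()).append(number)

    # Take the next unused number for each name, keeping the [number, name] layout.
    return [[queues[name].popleft(), name] for _, name, _ in reference]


result = reorder_by_names(lst_1, lst_2)
print(result)
```

Each list is traversed once and every deque operation is O(1), so the whole thing stays linear in the length of the lists.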

Not very Pythonic, but easy to understand, and it works:

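lst_new = []
for item in lst_2:
    name = item[1]
    for item2 in lst_1:
        if name == item2[1]:
            lst_new.append(item2.copy())
            # Note: this removes the matched row from lst_1, so lst_1 ends up empty.
            lst_1.remove(item2)
            # item2[1] = "" is also an option but it's worse for long inputs
            break  # stop scanning right after lst_1 has been modified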

Output:

>>> lst_new
[[2, 'Mcquin'], [4, 'Paul'], [1, 'John'], [9, 'Coco'], [7, 'Jimmy'], [11, 'Coco']]
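
Because the loop above removes rows from lst_1 as it goes, the original list is empty afterwards. If lst_1 needs to survive, one possible variation (a sketch, not the answerer's code; the name remaining is made up for illustration) is to consume a shallow copy instead:

```python
lst_1 = [[1, 'John'], [2, 'Mcquin'], [4, 'Paul'], [7, 'Jimmy'], [9, 'Coco'], [11, 'Coco']]
lst_2 = [[3, 'Mcquin', 1], [6, 'Paul', 6], [5, 'John', 15], [12, 'Coco', 18], [8, 'Jimmy', 24], [10, 'Coco', 24]]

remaining = list(lst_1)  # shallow copy, so lst_1 itself is never modified
lst_new = []
for _, name, _ in lst_2:
    for row in remaining:
        if row[1] == name:
            lst_new.append(row.copy())
            remaining.remove(row)  # each row can be matched only once
            break  # stop scanning right after modifying the list being iterated

print(lst_new)
```

This is still quadratic in the worst case, like the original loop.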

Given two lists:

xs = [[4, 'a'], [3, 'b'], [7, 'c'], [10, 'd']]
ys = [7, 3, 4, 10]

the following line sorts the list xs by the order of items in ys:

[x for y in ys for x in xs if x[0] == y]

Result:

>>> [x for y in ys for x in xs if x[0] == y]
[[7, 'c'], [3, 'b'], [4, 'a'], [10, 'd']]
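
The comprehension above rescans xs for every element of ys. For longer lists, a dictionary lookup may be preferable. The sketch below assumes the first element of each sublist is unique and that every value in ys occurs in xs; the name by_first is hypothetical.

```python
xs = [[4, 'a'], [3, 'b'], [7, 'c'], [10, 'd']]
ys = [7, 3, 4, 10]

# Index the sublists by their first element (assumed unique).
by_first = {x[0]: x for x in xs}

# Pick the sublists in the order given by ys; a value missing from xs raises KeyError.
result = [by_first[y] for y in ys]
print(result)
```

Unlike the comprehension, which silently skips values of ys that have no match, this version fails loudly on a missing key.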

Try this:

l = sorted(lst_1, key=lambda x: [i[2] for i in lst_2 if i[1] == x[1]][0])

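Explanation: We're sorting with the key being the 3rd value (i[2]) of the first lst_2 entry whose 2nd value matches the name being sorted (i[1] == x[1]). Because only the first match is used, duplicate names such as Coco all get the same key and stay next to each other in the result. This also relies on the 3rd values in lst_2 increasing in the desired order, as they do here.

Note that if a name that exists in lst_1 is missing from lst_2, an IndexError is raised (perhaps justifiably, since a key is missing).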
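
The key function above scans lst_2 once per element of lst_1. A sketch of the same idea with the lookup table built once is shown below; the name first_value is hypothetical. It keeps the first 3rd value seen for each name, as the [0] in the one-liner does.

```python
lst_1 = [[1, 'John'], [2, 'Mcquin'], [4, 'Paul'], [7, 'Jimmy'], [9, 'Coco'], [11, 'Coco']]
lst_2 = [[3, 'Mcquin', 1], [6, 'Paul', 6], [5, 'John', 15], [12, 'Coco', 18], [8, 'Jimmy', 24], [10, 'Coco', 24]]

# name -> 3rd value of the first lst_2 entry with that name
first_value = {}
for _, name, value in lst_2:
    first_value.setdefault(name, value)

# A name missing from lst_2 raises KeyError here.
result = sorted(lst_1, key=lambda row: first_value[row[1]])
print(result)
```

With the sample data both Coco rows share the key 18, so they end up adjacent and ahead of Jimmy, which differs from the preferred output in the question.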
