
How to sort a list of lists based on values of another list of lists?

My two lists of lists are:

lst_1 = [[1, 'John'], [2, 'Mcquin'], [4, 'Paul'], [7, 'Jimmy'], [9, 'Coco'], [11, 'Coco']]
lst_2 = [[3, 'Mcquin', 1], [6, 'Paul', 6], [5, 'John', 15], [12, 'Coco', 18], [8, 'Jimmy', 24], [10, 'Coco', 24]]

What is the most efficient way to sort lst_1 so that its names follow the order of the second value in each sublist of lst_2 (lst_2[i][1])?

Preferred output:

[[2, 'Mcquin'], [4, 'Paul'], [1, 'John'], [9, 'Coco'], [7, 'Jimmy'], [11, 'Coco']]

It doesn't matter if there are duplicates of the same name (Coco in this case). Also, both lists will always contain the same names, as they do here.

If both lists contain the same number of names, you could store the lst_2 index of each name in a collections.defaultdict of deques, then pop off the next index for a name and use it as the sort key each time that name is encountered while sorting lst_1.

Demo:

from collections import defaultdict, deque

lst_1 = [[1, 'John'], [2, 'Mcquin'], [4, 'Paul'], [7, 'Jimmy'], [9, 'Coco'], [11, 'Coco']]
lst_2 = [[3, 'Mcquin', 1], [6, 'Paul', 6], [5, 'John', 15], [12, 'Coco', 18], [8, 'Jimmy', 24], [10, 'Coco', 24]]

sort_map = defaultdict(deque)
for i, x in enumerate(lst_2):
    sort_map[x[1]].append(i)

result = sorted(lst_1, key=lambda x: sort_map[x[1]].popleft())

print(result)

Output:

[[2, 'Mcquin'], [4, 'Paul'], [1, 'John'], [9, 'Coco'], [7, 'Jimmy'], [11, 'Coco']]

Note: You can use collections.deque to pop off elements from the beginning in constant time, as shown above. This minor improvement keeps the above solution at O(N log N) overall, which is the cost of sorting.
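To make the note above concrete, here is a minimal sketch comparing the two ways of taking the leftmost element. The variable names are illustrative and not part of the original answer; the indices are the positions of 'Coco' in lst_2.

```python
from collections import deque

# Positions of 'Coco' in lst_2 (illustrative values taken from the demo above)
indices_list = [3, 5]
indices_deque = deque(indices_list)

# Both calls return the leftmost element. list.pop(0) has to shift every
# remaining element (O(n)), while deque.popleft() runs in O(1).
first_from_list = indices_list.pop(0)
first_from_deque = indices_deque.popleft()

print(first_from_list, first_from_deque)  # 3 3
print(indices_list, list(indices_deque))  # [5] [5]
```

For a handful of duplicates the difference is negligible; it only matters when one name repeats many times.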

Edit: I think I have an O(n) solution!


Originally, I thought we could create a dictionary mapping each name to the index at which it should appear in the final list, based on lst_2. We could then create the final list by sorting lst_1, giving an O(n log(n)) solution.

However, the problem with that method is that there are duplicate names in lst_2! Also, this new method even has a better time complexity!


First, we create a dictionary based on lst_1 where each key is a name and each value is a collections.deque (thanks RoadRunner) of the numbers which correspond to that name.

By using a deque, we preserve the relative order of the elements in lst_1 that share a name. We can also call .popleft() on a deque in O(1) time.

This then allows us to iterate over lst_2 (no sorting is needed, since it is already in the desired order) and append to a new list each name followed by the first remaining value for that name in the dictionary we created.

Because .popleft() both returns and removes the first element, the next time that name comes up in lst_2 we get the next value from lst_1.

So, here's the code:

import collections
vals = {}
for v, n in lst_1:
    vals.setdefault(n, collections.deque()).append(v)

# vals == {'Paul': [4], 'Coco': [9, 11], 'John': [1], 'Mcquin': [2], 'Jimmy': [7]}
# (each value above is actually a deque, but it's easier to read as a list)
r = []
for _,n,_ in lst_2:
    r.append([n, vals[n].popleft()])

giving r (for result) as:

[['Mcquin', 2], ['Paul', 4], ['John', 1], ['Coco', 9], ['Jimmy', 7], ['Coco', 11]]
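The result above puts the name first, whereas the question's preferred output has the number first. As a hedged sketch, the same single-pass idea can be wrapped in a function that emits [number, name] pairs. The function name reorder_by_names is hypothetical, and it assumes both lists contain the same names the same number of times.

```python
from collections import defaultdict, deque

def reorder_by_names(items, reference):
    """Return [number, name] pairs from items, in the name order of reference.

    Hypothetical helper: assumes each name occurs equally often in both lists.
    """
    queues = defaultdict(deque)
    for number, name in items:
        queues[name].append(number)
    # One pass over reference; each popleft() is O(1), so the whole thing is O(n).
    return [[queues[entry[1]].popleft(), entry[1]] for entry in reference]

lst_1 = [[1, 'John'], [2, 'Mcquin'], [4, 'Paul'], [7, 'Jimmy'], [9, 'Coco'], [11, 'Coco']]
lst_2 = [[3, 'Mcquin', 1], [6, 'Paul', 6], [5, 'John', 15], [12, 'Coco', 18], [8, 'Jimmy', 24], [10, 'Coco', 24]]

print(reorder_by_names(lst_1, lst_2))
# [[2, 'Mcquin'], [4, 'Paul'], [1, 'John'], [9, 'Coco'], [7, 'Jimmy'], [11, 'Coco']]
```

If a name in reference has no remaining entry in items, popleft() raises IndexError, which surfaces the mismatch rather than hiding it.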

Not very Pythonic, but easy to understand, and it works:

lst_new = []
for item in lst_2:
    name = item[1]
    for item2 in lst_1:
        if name == item2[1]:
            lst_new.append(item2.copy())
            # Remove the match so a duplicate name picks up the next entry.
            # Note: this mutates lst_1.
            lst_1.remove(item2)
            # item2[1] = "" is also an option, but it's worse for long inputs
            break

Output:

>>> lst_new
[[2, 'Mcquin'], [4, 'Paul'], [1, 'John'], [9, 'Coco'], [7, 'Jimmy'], [11, 'Coco']]
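Since the loop above removes entries from lst_1 as it goes, here is a sketch of the same quadratic approach working on a shallow copy, so the caller's list stays intact. The function name reorder_quadratic is an assumption made for illustration.

```python
def reorder_quadratic(items, reference):
    """Nested-loop reorder that leaves the input list untouched (O(n^2))."""
    remaining = list(items)  # shallow copy; removals affect only this list
    result = []
    for entry in reference:
        for candidate in remaining:
            if candidate[1] == entry[1]:
                result.append(candidate.copy())
                remaining.remove(candidate)  # so duplicates get the next match
                break
    return result

lst_1 = [[1, 'John'], [2, 'Mcquin'], [4, 'Paul'], [7, 'Jimmy'], [9, 'Coco'], [11, 'Coco']]
lst_2 = [[3, 'Mcquin', 1], [6, 'Paul', 6], [5, 'John', 15], [12, 'Coco', 18], [8, 'Jimmy', 24], [10, 'Coco', 24]]

lst_new = reorder_quadratic(lst_1, lst_2)
print(lst_new)
print(len(lst_1))  # still 6
```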

Given two lists:

xs = [[4, 'a'], [3, 'b'], [7, 'c'], [10, 'd']]
ys = [7, 3, 4, 10]

the following line sorts the list xs by the order of the items in ys:

[x for y in ys for x in xs if x[0] == y]

Result:

>>> [x for y in ys for x in xs if x[0] == y]
[[7, 'c'], [3, 'b'], [4, 'a'], [10, 'd']]
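The comprehension above rescans xs for every item in ys. When the first elements of xs are unique (an assumption that holds for this example but not necessarily in general), a dictionary lookup gives the same result in a single pass. The name by_key is illustrative.

```python
xs = [[4, 'a'], [3, 'b'], [7, 'c'], [10, 'd']]
ys = [7, 3, 4, 10]

# Index each sublist by its first element (assumes those elements are unique).
by_key = {x[0]: x for x in xs}

ordered = [by_key[y] for y in ys]
print(ordered)  # [[7, 'c'], [3, 'b'], [4, 'a'], [10, 'd']]
```

One behavioral difference: a value in ys with no match in xs raises KeyError here, whereas the comprehension silently skips it.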

Try this:

l = sorted(lst_1, key=lambda x: [i[2] for i in lst_2 if i[1] == x[1]][0])

Explanation: The sort key for each item of lst_1 is the third value (i[2]) of the first lst_2 entry whose second value matches the item's name (i[1] == x[1]).

Note that if a name that exists in lst_1 is missing from lst_2, an IndexError is raised (perhaps justifiably, since a key is missing).
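The one-liner rebuilds a filtered list from lst_2 for every item it sorts. A possible variation, sketched here with the hypothetical name first_key, precomputes the lookup once. It keeps the same behavior of using the first matching entry, so both Coco items get the key 18 and end up next to each other, ahead of Jimmy, which is not quite the order asked for in the question.

```python
lst_1 = [[1, 'John'], [2, 'Mcquin'], [4, 'Paul'], [7, 'Jimmy'], [9, 'Coco'], [11, 'Coco']]
lst_2 = [[3, 'Mcquin', 1], [6, 'Paul', 6], [5, 'John', 15], [12, 'Coco', 18], [8, 'Jimmy', 24], [10, 'Coco', 24]]

# Map each name to the third value of its first occurrence in lst_2.
first_key = {}
for _, name, value in lst_2:
    first_key.setdefault(name, value)

result = sorted(lst_1, key=lambda x: first_key[x[1]])
print(result)
# [[2, 'Mcquin'], [4, 'Paul'], [1, 'John'], [9, 'Coco'], [11, 'Coco'], [7, 'Jimmy']]
```

With this version, a name missing from lst_2 raises KeyError rather than IndexError.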
