Sort tuple list with another list
I have a list of tuples, to_order, such as:
to_order = [(0, 1), (1, 3), (2, 2), (3,2)]
And a list, order, which gives the order to apply to the second element of each tuple in to_order:
order = [2, 1, 3]
So I am looking for a way to get this output:
ordered_list = [(2, 2), (3,2), (0, 1), (1, 3)]
Any ideas?
You can provide a key that looks up the index of each tuple's second element in order and sorts based on it:
to_order = [(0, 1), (1, 3), (2, 2), (3,2)]
order = [2, 1, 3]
print(sorted(to_order, key=lambda item: order.index(item[1]))) # [(2, 2), (3, 2), (0, 1), (1, 3)]
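For a long order list, the repeated order.index scans can be avoided by building the positions once. This is only a sketch: the name rank is made up for illustration, and it assumes that order has no duplicates and that every second element appears in order.

```python
to_order = [(0, 1), (1, 3), (2, 2), (3, 2)]
order = [2, 1, 3]

# Map each value in `order` to its position (`rank` is a hypothetical name).
rank = {value: position for position, value in enumerate(order)}

# A dict lookup per tuple replaces the linear scan done by order.index().
result = sorted(to_order, key=lambda item: rank[item[1]])
print(result)  # [(2, 2), (3, 2), (0, 1), (1, 3)]
```

A second element that is missing from order raises KeyError here, where order.index would raise ValueError.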
EDIT
Since a discussion on time complexities was started, here you go: the following algorithm runs in O(n+m), using Eric's input example:
from random import randrange, shuffle
from timeit import timeit

N = 5
to_order = [(randrange(N), randrange(N)) for _ in range(10*N)]
order = list(set(pair[1] for pair in to_order))
shuffle(order)

def eric_sort(to_order, order):
    # Group the pairs by second element, then emit the groups in `order`.
    bins = {}
    for pair in to_order:
        bins.setdefault(pair[1], []).append(pair)
    return [pair for i in order for pair in bins[i]]

def alfasin_new_sort(to_order, order):
    # One bucket per position in `order`; d maps each value to its bucket index.
    arr = [[] for i in range(len(order))]
    d = {k:v for v, k in enumerate(order)}
    for item in to_order:
        arr[d[item[1]]].append(item)
    return [item for sublist in arr for item in sublist]

# globals=globals() lets timeit see the names defined above.
print("eric_sort", timeit("eric_sort(to_order, order)", globals=globals(), number=1000))
print("alfasin_new_sort", timeit("alfasin_new_sort(to_order, order)", globals=globals(), number=1000))
OUTPUT:
eric_sort 59.282021682999584
alfasin_new_sort 44.28244407700004
You can distribute the tuples into a dict of lists according to their second element, then iterate over order to get the sorted list:
from collections import defaultdict
to_order = [(0, 1), (1, 3), (2, 2), (3, 2)]
order = [2, 1, 3]
bins = defaultdict(list)
for pair in to_order:
    bins[pair[1]].append(pair)
print(bins)
# defaultdict(<class 'list'>, {1: [(0, 1)], 3: [(1, 3)], 2: [(2, 2), (3, 2)]})
print([pair for i in order for pair in bins[i]])
# [(2, 2), (3, 2), (0, 1), (1, 3)]
Neither sort nor index is needed, and the output is stable.
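To make the stability claim concrete, here is a small sketch that wraps the binning idea in a function. The name order_by_second and the sample data are assumptions made for this illustration, not part of the original answer.

```python
from collections import defaultdict

def order_by_second(pairs, order):
    """Group pairs by their second element, then emit the groups in `order`."""
    bins = defaultdict(list)
    for pair in pairs:
        bins[pair[1]].append(pair)
    # bins.get avoids creating empty entries for keys that never occur in pairs.
    return [pair for key in order for pair in bins.get(key, [])]

pairs = [(3, 2), (0, 1), (2, 2), (1, 3), (9, 2)]
result = order_by_second(pairs, [2, 1, 3])
print(result)  # [(3, 2), (2, 2), (9, 2), (0, 1), (1, 3)]
```

The three tuples whose second element is 2 keep their input order, which is what stable means here.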
This algorithm is similar to the mapping mentioned in the supposed duplicate. That linked answer only works if to_order and order have the same length, which isn't the case in OP's question.
This algorithm iterates twice over each element of to_order. The complexity is O(n + m), where n = len(to_order) and m = len(order). @alfasin's first algorithm is much slower (O(n * m) for the order.index lookups, since the key is computed once per element, plus O(n log n) for the sort), but his second one is also linear.
Here's a list of 10000 random pairs of integers between 0 and 999 (N = 1000). We extract the unique second elements and shuffle them to define order:
from random import randrange, shuffle
from collections import defaultdict
from timeit import timeit
from itertools import chain
N = 1000
to_order = [(randrange(N), randrange(N)) for _ in range(10*N)]
order = list(set(pair[1] for pair in to_order))
shuffle(order)
def eric(to_order, order):
    bins = defaultdict(list)
    for pair in to_order:
        bins[pair[1]].append(pair)
    return list(chain.from_iterable(bins[i] for i in order))

def alfasin1(to_order, order):
    arr = [[] for i in range(len(order))]
    d = {k:v for v, k in enumerate(order)}
    for item in to_order:
        arr[d[item[1]]].append(item)
    return [item for sublist in arr for item in sublist]

def alfasin2(to_order, order):
    return sorted(to_order, key=lambda item: order.index(item[1]))
print(eric(to_order, order) == alfasin1(to_order, order))
# True
print(eric(to_order, order) == alfasin2(to_order, order))
# True
print("eric", timeit("eric(to_order, order)", globals=globals(), number=100))
# eric 0.3117517130003762
print("alfasin1", timeit("alfasin1(to_order, order)", globals=globals(), number=100))
# alfasin1 0.36100843100030033
print("alfasin2", timeit("alfasin2(to_order, order)", globals=globals(), number=100))
# alfasin2 15.031453827000405
Another solution:
[item for key in order for item in filter(lambda x: x[1] == key, to_order)]
This solution works off of order first, filtering to_order for each key in order.
Equivalent:
ordered = []
for key in order:
    for item in filter(lambda x: x[1] == key, to_order):
        ordered.append(item)
Shorter, but I'm not aware of a way to do this with a list comprehension:
ordered = []
for key in order:
    ordered.extend(filter(lambda x: x[1] == key, to_order))
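One possible single-expression form of the extend loop uses itertools.chain.from_iterable to concatenate the per-key matches. This is a sketch of an alternative, not the answer author's code:

```python
from itertools import chain

to_order = [(0, 1), (1, 3), (2, 2), (3, 2)]
order = [2, 1, 3]

# Build one list of matching tuples per key, then concatenate the lists
# in the sequence given by `order`.
ordered = list(chain.from_iterable(
    [item for item in to_order if item[1] == key] for key in order
))
print(ordered)  # [(2, 2), (3, 2), (0, 1), (1, 3)]
```

Like the loop versions, this scans to_order once for every key in order.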
Note: This will not throw a ValueError if to_order contains a tuple x where x[1] is not in order; such tuples are simply left out of the result.
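The difference is easy to see with one extra tuple whose second element is missing from order. The tuple (4, 7) and the variable names below are made up for this sketch:

```python
to_order = [(0, 1), (1, 3), (2, 2), (3, 2), (4, 7)]  # 7 is not in order
order = [2, 1, 3]

# The filter-based version silently leaves (4, 7) out.
filtered = [item for key in order
            for item in filter(lambda x: x[1] == key, to_order)]
print(filtered)  # [(2, 2), (3, 2), (0, 1), (1, 3)]

# The index-based key raises ValueError for the same input.
try:
    sorted(to_order, key=lambda item: order.index(item[1]))
    raised = False
except ValueError:
    raised = True
print(raised)  # True
```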
I personally prefer the list object's sort method over the built-in sorted, which generates a new list rather than changing the list in place.
to_order = [(0, 1), (1, 3), (2, 2), (3,2)]
order = [2, 1, 3]
to_order.sort(key=lambda x: order.index(x[1]))
print(to_order)
>[(2, 2), (3, 2), (0, 1), (1, 3)]
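A short sketch of how the two calls behave. The helper name by_order is introduced only for this example:

```python
to_order = [(0, 1), (1, 3), (2, 2), (3, 2)]
order = [2, 1, 3]

def by_order(item):
    # Position of the tuple's second element in `order`.
    return order.index(item[1])

new_list = sorted(to_order, key=by_order)  # returns a new list
print(to_order[0])  # (0, 1): the original list is untouched

returned = to_order.sort(key=by_order)     # sorts in place
print(returned)     # None
print(to_order == new_list)  # True
```

Because list.sort returns None, writing to_order = to_order.sort(...) would lose the list.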
A little explanation along the way: the key parameter of the sort method basically preprocesses the list and ranks all the values based on a measure. In our case, order.index() looks at the first occurrence of the currently processed item and returns its position.
x = [1,2,3,4,5,3,3,5]
print(x.index(5))
>4
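Applied to the question's data, the values produced by the key can be listed directly. The variable keys and the zip-based sort below are only an illustration of what the key does, not how sort is implemented:

```python
to_order = [(0, 1), (1, 3), (2, 2), (3, 2)]
order = [2, 1, 3]

# The value the key function produces for each tuple, in input order.
keys = [order.index(item[1]) for item in to_order]
print(keys)  # [1, 2, 0, 0]

# Sorting by those values is stable, so (2, 2) stays ahead of (3, 2).
ranked = [item for _, item in sorted(zip(keys, to_order), key=lambda p: p[0])]
print(ranked)  # [(2, 2), (3, 2), (0, 1), (1, 3)]
```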