Ordering a list of tuples by equality of the 1st element of one tuple and the 2nd element of another tuple
I have a list of tuples representing points (x, y) and want to order them such that if x_i of a point p_i is equal to y_j of another point p_j, then p_i comes right after p_j. The points are such that x and y never repeat between points: e.g. given the point (1, 2), the points (1, y) or (x, 2) are not allowed for any x and y. For example:
points = [(1, 5), (3, 4), (5, 3), (4, 1), (7,2), (2, 6)] # valid points
should be ordered as [(1, 5), (5, 3), (3, 4), (4, 1), (7, 2), (2, 6)]
Here is the code I wrote to do this:
N = len(points)
for i in range(N):
    for j in range(i + 1, N):
        if points[i][1] == points[j][0]:
            points.insert(i + 1, points.pop(j))
            break
Unfortunately the complexity of this is O(N^2), and for a big list of points it is slow. Is there a way to do this faster?
If you think of your unordered list as the description of a directed graph in which every node belongs to exactly one chain, you can use the following abstraction.
points = [(1, 5), (3, 4), (5, 3), (4, 1), (7,2), (2, 6)]
# Create the graph and initialize the list of chains
graph, chains, seen = dict(points), [], set()
# Find the chains in the graph
for node, target in graph.items():
    while node not in seen:
        seen.add(node)
        chains.append((node, target))
        node = target
        try:
            target = graph[target]
        except KeyError:
            break
# chains : [(1, 5), (5, 3), (3, 4), (4, 1), (7, 2), (2, 6)]
This gives us an algorithm that runs in O(n), since each point is visited once and each dictionary lookup takes O(1) time on average.
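One caveat: the loop above starts each chain at whichever key it reaches first, so a chain whose head appears late in the input gets split, just as with the question's original loop. Below is a minimal sketch of one way around that, assuming (as the question states) that no x value and no y value repeats. The name order_chains and the reverse map parent are my own illustrative choices, not part of the answer above.

```python
def order_chains(points):
    """Order (x, y) points so each point follows the one whose y equals its x.

    Assumption: no x value and no y value is repeated (as in the question).
    """
    graph = dict(points)                  # x -> y
    parent = {y: x for x, y in points}    # y -> x, usable because y is unique
    ordered, seen = [], set()
    for start in graph:
        if start in seen:
            continue
        # Rewind from `start` to the head of its chain.
        head = start
        while head in parent:
            prev = parent[head]
            if prev == start:
                # Closed cycle: there is no head, so keep the first-seen point.
                head = start
                break
            head = prev
        # Walk forward from the head, emitting points until the chain ends.
        node = head
        while node in graph and node not in seen:
            seen.add(node)
            ordered.append((node, graph[node]))
            node = graph[node]
    return ordered


print(order_chains([(5, 3), (1, 5), (3, 4)]))
# [(1, 5), (5, 3), (3, 4)]
```

For the points in the question this returns the same list as the expected ordering. Every point is rewound over at most once and emitted once, so the sketch should stay linear in the number of points.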
You can convert your searches to O(1) time by caching lists of points that have the same first term. (And the caching is O(N) time.) The code to do this gets a little tricky, mainly in keeping track of which items have already been processed, but it should work pretty quickly. Here's an example:
from collections import defaultdict, deque
points = [(1, 5), (3, 4), (5, 3), (4, 1), (1,6), (7,2), (3,4), (2,3)]
# make a dictionary of lists of points, grouped by first element
cache = defaultdict(deque)
for i, p in enumerate(points):
    cache[p[0]].append(i)
# keep track of all points that will be processed
points_to_process = set(range(len(points)))
i = 0
next_idx = i
ordered_points = []
while i < len(points):
    # get the next point to be added to the ordered list
    cur_point = points[next_idx]
    ordered_points.append(cur_point)
    # remove this point from the cache (with popleft())
    # note: it will always be the first one in the corresponding list;
    # the popleft() is kept outside the assert so it still runs under python -O
    first_idx = cache[cur_point[0]].popleft()
    assert first_idx == next_idx
    points_to_process.discard(next_idx)
    # find the next item to add to the list
    try:
        # get the first remaining point that matches this
        next_idx = cache[cur_point[1]][0]
    except IndexError:
        # no matching point; advance to the next unprocessed one
        while i < len(points):
            if i in points_to_process:
                next_idx = i
                break
            else:
                i += 1
ordered_points
# [(1, 5), (5, 3), (3, 4), (4, 1), (1, 6), (7, 2), (2, 3), (3, 4)]
You can avoid creating the points_to_process set to save memory (and maybe time), but the code gets more complex:
from collections import defaultdict, deque
points = [(1, 5), (3, 4), (5, 3), (4, 1), (1,6), (7,2), (3,4), (2,3)]
# make a dictionary of lists of points, grouped by first element
cache = defaultdict(deque)
for i, p in enumerate(points):
    cache[p[0]].append(i)
i = 0
next_idx = i
ordered_points = []
while i < len(points):
    # get the next point to be added to the ordered list
    cur_point = points[next_idx]
    ordered_points.append(cur_point)
    # remove this point from the cache
    # note: it will always be the first one in the corresponding list
    first_idx = cache[cur_point[0]].popleft()
    assert first_idx == next_idx
    # find the next item to add to the list
    try:
        next_idx = cache[cur_point[1]][0]
    except IndexError:
        # advance to the next unprocessed point
        while i < len(points):
            # i is unprocessed only if it is still first in its list
            remaining = cache[points[i][0]]
            if remaining and remaining[0] == i:
                next_idx = i
                break
            # no longer available, move on to next point
            i += 1
ordered_points
# [(1, 5), (5, 3), (3, 4), (4, 1), (1, 6), (7, 2), (2, 3), (3, 4)]
Thanks everyone for the help. Here is my own solution using numpy and a while loop (a lot slower than the solution by Matthias Fripp, but faster than the two nested for-loops in the question's code):
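import numpy as np
# example of points
points = [(1, 5), (17, 2),(3, 4), (5, 3), (4, 1), (6, 8), (9, 7), (2, 6)]
points = np.array(points)
x, y = points[:,0], points[:,1]
N = points.shape[0]
i = 0
idx = [0]
remaining = set(range(1, N))
while len(idx) < N:
    try:
        # index of the point whose x equals the current point's y
        i = np.where(x == y[i])[0][0]
        if i in remaining:
            remaining.remove(i)
        else:
            # already used (closed loop): jump to an arbitrary unused point
            i = remaining.pop()
    except IndexError:
        # no point continues the chain: jump to an arbitrary unused point
        i = remaining.pop()
    idx.append(i)
# convert back to a list of tuples of plain Python ints
[tuple(p) for p in points[idx].tolist()]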
# [(1, 5), (5, 3), (3, 4), (4, 1), (17, 2), (2, 6), (6, 8), (9, 7)]
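The np.where call above rescans all of x on every step, which keeps the loop at O(N^2) overall. Here is a rough sketch of one alternative, assuming x values are unique as the question states: compute every point's successor index in a single vectorized pass with np.argsort and np.searchsorted, then walk those links. The helper names successor_indices and order_points are hypothetical, and this is only one way to do it.

```python
import numpy as np


def successor_indices(points):
    """For each point i, return the index j with x[j] == y[i], or -1 if none.

    Assumption: x values are unique, as stated in the question.
    """
    pts = np.asarray(points)
    x, y = pts[:, 0], pts[:, 1]
    order = np.argsort(x)                    # indices that sort x
    pos = np.searchsorted(x[order], y)       # slot of each y among the sorted x
    pos = np.clip(pos, 0, len(x) - 1)        # keep slots inside the array
    cand = order[pos]                        # candidate successor per point
    return np.where(x[cand] == y, cand, -1)  # keep only exact matches


def order_points(points):
    """Walk successor links, starting a new chain at each unvisited point."""
    nxt = successor_indices(points).tolist()
    visited = [False] * len(points)
    idx = []
    for start in range(len(points)):
        i = start
        while i != -1 and not visited[i]:
            visited[i] = True
            idx.append(i)
            i = nxt[i]
    return [points[i] for i in idx]


points = [(1, 5), (17, 2), (3, 4), (5, 3), (4, 1), (6, 8), (9, 7), (2, 6)]
print(order_points(points))
# [(1, 5), (5, 3), (3, 4), (4, 1), (17, 2), (2, 6), (6, 8), (9, 7)]
```

The sort makes this O(N log N) rather than linear, but the per-step lookup becomes a plain list access, and new chains always start at the lowest unused index instead of whatever set.pop() happens to return.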
A recursive divide-and-conquer approach may have a better runtime. Since this isn't really a straightforward sorting problem, you can't just throw together a modified quicksort or whatever. I think a good solution would be a merge algorithm. Here's some pseudocode that may help.
let points = [(1, 5), (3, 4), (5, 3), (4, 1), (1,6), (7,2), (3,4), (2,3)];
function tupleSort(tupleList):
    if length(tupleList) <= 1:
        return tupleList
    if length(tupleList) == 2:
        //Trivial solution. Only two tuples in the list. They are either
        //swapped or left in place
        if tupleList[0].x == tupleList[1].y:
            return reverse(tupleList)
        else:
            return tupleList
    else:
        let length = length(tupleList)
        //bounds are inclusive, so the two halves cover every element exactly once
        let firstHalf = tupleSort(tupleList[0 -> length/2 - 1])
        let secondHalf = tupleSort(tupleList[length/2 -> length - 1])
        return merge(firstHalf, secondHalf)
function merge(firstList, secondList):
    indexOfUnsorted = getNotSorted(firstList)
    if indexOfUnsorted > -1:
        //iterate through the second list and find an x item
        //that matches the y of the first list and put the
        //second list into the first list at that position
        return mergedLists
    else:
        return append(firstList, secondList)
function getNotSorted(list):
    //iterate once through the list and return -1 if sorted
    //otherwise return the index of the first item whose y value
    //is not equal to the next item's x value
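The pseudocode leaves the merge step open, so here is one possible reading of it as runnable Python, offered as a sketch rather than the intended implementation. It assumes unique x values (as in the question, unlike the sample list above), represents each partial result as a list of chains instead of a flat list, and joins chains wherever one chain's last y equals another chain's first x. The names tuple_sort, merge and flatten are my own.

```python
def merge(first, second):
    """Join chains from two halves wherever a tail's y equals a head's x."""
    chains = first + second
    by_head = {chain[0][0]: chain for chain in chains}  # head x -> chain
    tails = {chain[-1][1] for chain in chains}           # every tail y
    merged, used = [], set()
    # Visit true chain heads first, then whatever remains (closed cycles).
    starts = [c for c in chains if c[0][0] not in tails] + chains
    for chain in starts:
        head = chain[0][0]
        if head in used:
            continue
        joined = []
        while head in by_head and head not in used:
            used.add(head)
            joined.extend(by_head[head])
            head = joined[-1][1]  # y of the new tail
        merged.append(joined)
    return merged


def tuple_sort(points):
    """Return a list of chains; each chain is a list of linked points."""
    if len(points) <= 1:
        return [list(points)] if points else []
    mid = len(points) // 2
    return merge(tuple_sort(points[:mid]), tuple_sort(points[mid:]))


def flatten(chains):
    """Concatenate the chains back into one flat list of points."""
    return [p for chain in chains for p in chain]


points = [(1, 5), (3, 4), (5, 3), (4, 1), (7, 2), (2, 6)]
print(flatten(tuple_sort(points)))
# [(7, 2), (2, 6), (1, 5), (5, 3), (3, 4), (4, 1)]
```

Note that the chains come out intact but not necessarily in the order the question lists them: open chains are emitted before closed loops. Each merge copies every point once, so this version runs in roughly O(N log N), which is no better than the dictionary-based answers above.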