
Ordering a list of tuples by equality of the 1st element of one tuple and the 2nd element of another tuple

I have a list of tuples representing points (x, y) and want to order them so that whenever x_i of a point p_i equals y_j of another point p_j, p_i comes immediately after p_j. No x or y value is repeated between points: given the point (1, 2), no other point of the form (1, y) or (x, 2) is allowed. For example:

points = [(1, 5), (3, 4), (5, 3), (4, 1), (7,2), (2, 6)]  # valid points

should be ordered as [(1, 5), (5, 3), (3, 4), (4, 1), (7, 2), (2, 6)].

Here is the code I wrote to do this:

N = len(points)
for i in range(N):
    for j in range(i + 1, N):
        if points[i][1] == points[j][0]:
            points.insert(i + 1, points.pop(j))
            break

Unfortunately the complexity of this is O(N^2), and for a big list of points it is slow. Is there a way to do this faster?
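To make the target ordering concrete, here is a small checker, offered only as a sketch. The helper name `linked_pairs` is made up for this illustration; it counts how many adjacent pairs in a list satisfy "second element of one point equals first element of the next".

```python
def linked_pairs(ordered):
    """Count adjacent pairs (p, q) where p's second element equals q's first."""
    return sum(1 for p, q in zip(ordered, ordered[1:]) if p[1] == q[0])


points = [(1, 5), (3, 4), (5, 3), (4, 1), (7, 2), (2, 6)]
wanted = [(1, 5), (5, 3), (3, 4), (4, 1), (7, 2), (2, 6)]

print(linked_pairs(points))  # 1: only (7, 2) -> (2, 6) is already adjacent
print(linked_pairs(wanted))  # 4: all links except (4, 1) -> (1, 5), which closes a cycle
```

A linear list cannot place the closing link of a cycle next to its start, which is why the count for the desired ordering is 4 rather than 5.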

If you think of your unordered list as the description of a directed graph in which every node belongs to exactly one chain, you can use the following abstraction.

points = [(1, 5), (3, 4), (5, 3), (4, 1), (7,2), (2, 6)]

# Create the graph and initialize the list of chains
graph, chains, seen = dict(points), [], set()

# Find the chains in the graph
for node, target in graph.items():
    while node not in seen:
        seen.add(node)
        chains.append((node, target))
        node = target
        try:
            target = graph[target]
        except KeyError:
            break

# chains : [(1, 5), (5, 3), (3, 4), (4, 1), (7, 2), (2, 6)]

This gives us an algorithm that runs in O(n): each point is added to the result exactly once, and each dictionary lookup takes O(1) time on average.
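As a rough sketch, the graph idea can be wrapped in a function and compared with the question's nested loop on the sample input. The function names `order_quadratic` and `order_linear` are invented for this example, and the code assumes Python 3.7+ so that a dict iterates in insertion order.

```python
def order_quadratic(points):
    # The question's approach: nested scan plus list.insert/pop, O(N^2).
    pts = list(points)
    n = len(pts)
    for i in range(n):
        for j in range(i + 1, n):
            if pts[i][1] == pts[j][0]:
                pts.insert(i + 1, pts.pop(j))
                break
    return pts


def order_linear(points):
    # The graph approach: every point is appended exactly once.
    graph = dict(points)  # maps x -> y; relies on x values being unique
    ordered, seen = [], set()
    for node in graph:  # visits the x values in input order
        # Follow the chain until it ends or reaches a point already emitted.
        while node in graph and node not in seen:
            seen.add(node)
            ordered.append((node, graph[node]))
            node = graph[node]
    return ordered


points = [(1, 5), (3, 4), (5, 3), (4, 1), (7, 2), (2, 6)]
result = order_linear(points)
print(result)
# [(1, 5), (5, 3), (3, 4), (4, 1), (7, 2), (2, 6)]
print(result == order_quadratic(points))
# True
```

Both versions start each chain at the first not-yet-placed point in input order, so a chain whose true head appears later in the input is not pulled forward.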

You can reduce each search to O(1) time by caching lists of points that have the same first term. (Building the cache takes O(N) time.) The code to do this gets a little tricky, mainly in keeping track of which items have already been processed, but it should run pretty quickly. Here's an example:

from collections import defaultdict, deque

points = [(1, 5), (3, 4), (5, 3), (4, 1), (1,6), (7,2), (3,4), (2,3)]

# make a dictionary of lists of points, grouped by first element
cache = defaultdict(deque)
for i, p in enumerate(points):
    cache[p[0]].append(i)

# keep track of all points that will be processed
points_to_process = set(range(len(points)))

i = 0
next_idx = i
ordered_points = []
while i < len(points):
    # get the next point to be added to the ordered list
    cur_point = points[next_idx]
    ordered_points.append(cur_point)
    # remove this point from the cache (with popleft())
    # note: it will always be the first one in the corresponding list;
    # popleft() runs outside the assert so it still happens under `python -O`
    popped = cache[cur_point[0]].popleft()
    assert popped == next_idx
    points_to_process.discard(next_idx)
    # find the next item to add to the list
    try:
        # get the first remaining point that matches this
        next_idx = cache[cur_point[1]][0]
    except IndexError:
        # no matching point; advance to the next unprocessed one
        while i < len(points):
            if i in points_to_process:
                next_idx = i
                break
            else:
                i += 1

ordered_points
# [(1, 5), (5, 3), (3, 4), (4, 1), (1, 6), (7, 2), (2, 3), (3, 4)]

You can avoid creating the points_to_process set to save memory (and maybe time), but the code gets more complex:

from collections import defaultdict, deque

points = [(1, 5), (3, 4), (5, 3), (4, 1), (1,6), (7,2), (3,4), (2,3)]

# make a dictionary of lists of points, grouped by first element
cache = defaultdict(deque)
for i, p in enumerate(points):
    cache[p[0]].append(i)

i = 0
next_idx = i
ordered_points = []
while i < len(points):
    # get the next point to be added to the ordered list
    cur_point = points[next_idx]
    ordered_points.append(cur_point)
    # remove this point from the cache
    # note: it will always be the first one in the corresponding list;
    # popleft() runs outside the assert so it still happens under `python -O`
    popped = cache[cur_point[0]].popleft()
    assert popped == next_idx
    # find the next item to add to the list
    try:
        next_idx = cache[cur_point[1]][0]
    except IndexError:
        # advance to the next unprocessed point
        while i < len(points):
            # i is still unprocessed only if it is first in its deque
            bucket = cache[points[i][0]]
            if bucket and bucket[0] == i:
                next_idx = i
                break
            # no longer available, move on to next point
            i += 1

ordered_points
# [(1, 5), (5, 3), (3, 4), (4, 1), (1, 6), (7, 2), (2, 3), (3, 4)]
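For reuse, the same bucket idea can be packaged as a function. This is only a sketch: the name `order_with_duplicates` and the `used` flag list are assumptions of this example rather than part of the answer above. It keeps one deque of indices per first element and follows links until no unprocessed follower is left.

```python
from collections import defaultdict, deque


def order_with_duplicates(points):
    # Group the indices of the points by first element, in input order.
    buckets = defaultdict(deque)
    for idx, (x, _) in enumerate(points):
        buckets[x].append(idx)

    used = [False] * len(points)
    ordered = []
    for start in range(len(points)):
        if used[start]:
            continue
        idx = start
        while True:
            used[idx] = True
            x, y = points[idx]
            # idx is always the oldest unprocessed index for x, so it sits
            # at the left end of its deque.
            buckets[x].popleft()
            ordered.append(points[idx])
            # .get avoids creating empty deques for values that never occur.
            followers = buckets.get(y)
            if not followers:
                break
            idx = followers[0]
    return ordered


points = [(1, 5), (3, 4), (5, 3), (4, 1), (1, 6), (7, 2), (3, 4), (2, 3)]
print(order_with_duplicates(points))
# [(1, 5), (5, 3), (3, 4), (4, 1), (1, 6), (7, 2), (2, 3), (3, 4)]
```

Each index is appended and popped once, so the whole pass stays linear in the number of points.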

Thanks everyone for the help. Here is my own solution using numpy and a while loop (a lot slower than the solution by Matthias Fripp, but faster than using two for-loops as in the question's code):

import numpy as np

# example of points
points = [(1, 5), (17, 2), (3, 4), (5, 3), (4, 1), (6, 8), (9, 7), (2, 6)]

points = np.array(points)
x, y = points[:,0], points[:,1]

N = points.shape[0]
i = 0
idx = [0]
remaining = set(range(1, N))
while len(idx) < N: 
    try:
        i = np.where(x == y[i])[0][0]
        if i in remaining:
            remaining.remove(i)
        else:
            i = remaining.pop()
    except IndexError:
        i = remaining.pop()

    idx.append(i)

list(zip(points[idx][:,0], points[idx][:,1]))
# [(1, 5), (5, 3), (3, 4), (4, 1), (17, 2), (2, 6), (6, 8), (9, 7)]

A recursive divide-and-conquer approach may have a better runtime. Since this isn't really a straightforward sorting problem, you can't just throw together a modified quicksort or whatever. I think a good solution would be a merge algorithm. Here's some pseudocode that may help.

let points = [(1, 5), (3, 4), (5, 3), (4, 1), (1,6), (7,2), (3,4), (2,3)];
function tupleSort(tupleList):
    if length(tupleList) <= 1:
        return tupleList
    if length(tupleList) == 2:
        //Trivial solution. Only two tuples in the list. They are either
        //swapped or left in place
        if tupleList[0].x == tupleList[1].y:
            return reverse(tupleList)
        else:
            return tupleList
    else:
        //split at the midpoint; ranges are half-open [start -> end)
        let mid = length(tupleList) / 2
        let firstHalf = tupleSort(tupleList[0 -> mid])
        let secondHalf = tupleSort(tupleList[mid -> length(tupleList)])
        return merge(firstHalf, secondHalf)

function merge(firstList, secondList):
    indexOfUnsorted = getNotSorted(firstList)
    if indexOfUnsorted > -1:
        //iterate through the second list and find a x item 
        //that matches the y of the first list and put the
        //second list into the first list at that position
        return mergedLists
    else:
        return append(firstList, secondList)

function getNotSorted(list):
     //iterate once through the list and return -1 if sorted
     //otherwise return the index of the first item whose y value
     //is not equal to the next items x value

