Sort a list of tuples in consecutive order

I want to sort a list of tuples into consecutive order, so that the first element of each tuple is equal to the last element of the previous one.

For example:

input = [(10, 7), (4, 9), (13, 4), (7, 13), (9, 10)]
output = [(10, 7), (7, 13), (13, 4), (4, 9), (9, 10)]

I have developed a search like this:

output=[]
given = [(10, 7), (4, 9), (13, 4), (7, 13), (9, 10)]
t = given[0][0]
for i in range(len(given)):
    # search tuples starting with element t
    output += [e for e in given if e[0] == t]
    t = output[-1][-1]  # Get the next element to search

print(output)

Is there a Pythonic way to achieve such an order? And is there a way to do it "in place" (using only the one list)?

In my problem, the input can always be reordered in a circular way using all the tuples, so it does not matter which element is chosen first.

Assuming the tuples in your list form a cycle, you can use a dict to do this in O(n) time:

input = [(10, 7), (4, 9), (13, 4), (7, 13), (9, 10)]
input_dict = dict(input)  # Convert list of `tuples` to dict

elem = input[0][0]  # start point in the new list

new_list = []  # List of tuples for holding the values in required order

for _ in range(len(input)):
    new_list.append((elem, input_dict[elem]))
    elem = input_dict[elem]
    if elem not in input_dict:
        # Raise exception in case list of tuples is not circular
        raise Exception('key {} not found in dict'.format(elem))

The final value held by new_list will be:

>>> new_list
[(10, 7), (7, 13), (13, 4), (4, 9), (9, 10)]
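
The same walk can be wrapped in a function that validates its input instead of assuming it. This is only a sketch: the name chain_pairs and the choice of ValueError are assumptions made for illustration, not part of the answer above.

```python
def chain_pairs(pairs):
    """Return pairs reordered so each tuple starts where the previous one ends.

    Hypothetical helper: assumes the pairs are meant to form a single cycle
    and raises ValueError when they do not.
    """
    if not pairs:
        return []
    successor = dict(pairs)  # first element -> second element
    if len(successor) != len(pairs):
        raise ValueError("two tuples share the same first element")

    start = pairs[0][0]  # arbitrary start point, as in the question
    key = start
    ordered = []
    for _ in range(len(pairs)):
        if key not in successor:
            # Either nothing starts with key, or that tuple was already used.
            raise ValueError("the tuples do not form a single cycle")
        nxt = successor.pop(key)  # each tuple may be used only once
        ordered.append((key, nxt))
        key = nxt
    if key != start:
        raise ValueError("the chain does not return to its start")
    return ordered


data = [(10, 7), (4, 9), (13, 4), (7, 13), (9, 10)]
data[:] = chain_pairs(data)  # slice assignment keeps the same list object
print(data)
```

The slice assignment replaces the contents of the existing list, so other references to it see the new order. It still uses O(n) extra memory for the dictionary and the temporary list, so it is "in place" only in that weaker sense.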

If you are not afraid to waste some memory, you could create a dictionary start_dict with the start integers as keys and the tuples as values, and do something like this:

tpl = [(10, 7), (4, 9), (13, 4), (7, 13), (9, 10)]
start_dict = {item[0]: item for item in tpl}

start = tpl[0][0]
res = []
while start_dict:
    item = start_dict[start]
    del start_dict[start]
    res.append(item)
    start = item[-1]

print(res)

If two tuples start with the same number, you will lose one of them. If the tuples do not form a single cycle that uses every start number, the lookup start_dict[start] raises a KeyError before the dictionary is emptied.

But maybe this is something to build on.
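
A small illustration of the duplicate-start caveat, using made-up input in which two tuples start with 1: the dictionary comprehension keeps only the later tuple for that key.

```python
pairs = [(1, 2), (1, 3), (2, 1)]  # made-up input: two tuples start with 1
start_dict = {item[0]: item for item in pairs}

print(len(pairs), len(start_dict))  # 3 2
print(start_dict[1])                # (1, 3): the later tuple overwrote (1, 2)
```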

Actually, there are many open questions about what output you intend and what should happen if the input list does not have a valid structure for what you need.

Assume your input consists of pairs in which each number appears exactly twice. We can then treat the input as a graph where the numbers are nodes and each pair is an edge. As far as I understand your question, you suppose that this graph is a cycle that looks like this:

10 - 7 - 13 - 4 - 9 - 10 (same 10 as at the beginning)

This shows that you can reduce the list that stores the graph to [10, 7, 13, 4, 9]. Here is a script that sorts the input list:

# input
input = [(10, 7), (4, 9), (13, 4), (7, 13), (9, 10)]

# sorting and archiving
first = input[0][0]
last = input[0][1]
output_in_place = [first, last]

while last != first:
    for item in input:
        if item[0] == last:
            last = item[1]
            if last != first:
                output_in_place.append(last)

print(output_in_place)

# output
output = []
for i in range(len(output_in_place) - 1):
    output.append((output_in_place[i], output_in_place[i+1]))
output.append((output_in_place[-1], output_in_place[0]))

print(output)
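
Both directions of this reduction can also be written with zip. A minimal sketch, assuming the pairs are already in consecutive order; the names nodes and pairs_again are illustrative only:

```python
ordered = [(10, 7), (7, 13), (13, 4), (4, 9), (9, 10)]

# Pairs -> nodes: the first element of every pair is enough.
nodes = [first for first, _ in ordered]

# Nodes -> pairs: zip each node with its successor, wrapping around at the end.
pairs_again = list(zip(nodes, nodes[1:] + nodes[:1]))

print(nodes)        # [10, 7, 13, 4, 9]
print(pairs_again)  # [(10, 7), (7, 13), (13, 4), (4, 9), (9, 10)]
```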

I would first create a dictionary of the form

{first_value: [list of tuples with that first value], ...}

Then work from there:

from collections import defaultdict

input = [(10, 7), (4, 9), (13, 4), (7, 13), (9, 10)]

chosen_tuples = input[:1]  # Start from the first

first_values = defaultdict(list)  # Missing keys start as empty lists
for tup in input[1:]:
    first_values[tup[0]].append(tup)

while first_values:  # Loop will end when all lists are removed
    value = chosen_tuples[-1][1]  # Second item of last tuple
    tuples_with_that_value = first_values[value]
    chosen_tuples.append(tuples_with_that_value.pop())
    if not tuples_with_that_value:
        del first_values[value]  # List empty, remove it

You can try this:

input = [(10, 7), (4, 9), (13, 4), (7, 13), (9, 10)]

output = [input[0]]  # output contains the first element of input
temp = input[1:]  # temp contains the rest of elements in input

while temp:
    item = [i for i in temp if i[0] == output[-1][1]].pop()  # We compare each element with output[-1]
    output.append(item)  # We add the right item to output
    temp.remove(item)  # We remove each handled element from temp

Output:

>>> output
[(10, 7), (7, 13), (13, 4), (4, 9), (9, 10)]

This is a variant (less efficient than the dictionary version) where the list is changed in place:

tpl = [(10, 7), (4, 9), (13, 4), (7, 13), (9, 10)]

for i in range(1, len(tpl)-1):   # iterate over the indices of the list
    item = tpl[i-1]              # the last tuple that is already in place
    for j, next_item in enumerate(tpl[i:]):  # find the next item
                                             # in the remaining list
        if next_item[0] == item[1]:
            next_index = i + j
            break
    tpl[i], tpl[next_index] = tpl[next_index], tpl[i]  # now swap the items

Here is a more efficient version of the same idea:

tpl = [(10, 7), (4, 9), (13, 4), (7, 13), (9, 10)]
start_index = {item[0]: i for i, item in enumerate(tpl)}

item = tpl[0]
next_index = start_index[item[-1]]
for i in range(1, len(tpl)-1):
    tpl[i], tpl[next_index] = tpl[next_index], tpl[i]
    # need to update the start indices:
    start_index[tpl[next_index][0]] = next_index
    start_index[tpl[i][0]] = i
    next_index = start_index[tpl[i][-1]]
print(tpl)

The list is changed in place; the dictionary only contains the starting values of the tuples and their indices in the list.

Here is a robust solution using the sorted function and a custom key function:

input = [(10, 7), (4, 9), (13, 4), (7, 13), (9, 10)]

def consec_sort(lst):
    def key(x):
        nonlocal index
        if index <= lower_index:
            index += 1
            return -1
        return abs(x[0] - lst[index - 1][1])
    for lower_index in range(len(lst) - 2):
        index = 0
        lst = sorted(lst, key=key)
    return lst

output = consec_sort(input)
print(output)

The original list is not modified. Note that sorted is called 3 times for your input list of length 5. In each call, one additional tuple is placed correctly. The first tuple keeps its original position.

I have used the nonlocal keyword, meaning that this code is for Python 3 only (one could use global instead to make it legal Python 2 code).

My two cents:

def match_tuples(input):
    # making a copy to not mess up with the original one
    tuples = input[:]          # [(10,7), (4,9), (13, 4), (7, 13), (9, 10)]
    last_elem = tuples.pop(0)  # (10,7)

    # { "first tuple's element": "index in list"}
    indexes = {tup[0]: i for i, tup in enumerate(tuples)} # {9: 3, 4: 0, 13: 1, 7: 2}

    yield last_elem  # yield the first element

    for i in range(len(tuples)):
        # look up the index of the tuple whose first element matches the last element of the previous tuple
        list_index = indexes.get(last_elem[1])
        last_elem = tuples[list_index] # just get that tuple
        yield last_elem

Output :

input = [(10,7), (4,9), (13, 4), (7, 13), (9, 10)]
print(list(match_tuples(input)))
# output: [(10, 7), (7, 13), (13, 4), (4, 9), (9, 10)]

To get an O(n) algorithm, one needs to make sure not to do a double loop over the array. One way to do this is to keep already processed values in some sort of lookup table (a dict would be a good choice).

For example, something like this (I hope the inline comments explain the functionality well). This modifies the list in place and should avoid unnecessary (even implicit) looping over the list:

inp = [(10, 7), (4, 9), (13, 4), (7, 13), (9, 10)]

# A dictionary containing processed elements, first element is
# the key and the value represents the tuple. This is used to
# avoid the double loop
seen = {}

# The second value of the first tuple. This must match the first
# item of the next tuple
current = inp[0][1]

# Iteration to insert the next element
for insert_idx in range(1, len(inp)):
    # print('insert', insert_idx, seen)
    # If the next value was already found no need to search, just
    # pop it from the seen dictionary and continue with the next loop
    if current in seen:
        item = seen.pop(current)
        inp[insert_idx] = item
        current = item[1]
        continue

    # Search the list until the next value is found saving all
    # other items in the dictionary so we avoid to do unnecessary iterations
    # over the list.
    for search_idx in range(insert_idx, len(inp)):
        # print('search', search_idx, inp[search_idx])
        item = inp[search_idx]
        first, second = item
        if first == current:
            # Found the next tuple, break out of the inner loop!
            inp[insert_idx] = item
            current = second
            break
        else:
            seen[first] = item
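
To sanity-check the result of any of these approaches, a small helper can verify the ordering property from the question. This is a sketch; the name is_consecutive and its circular flag are assumptions made for illustration:

```python
def is_consecutive(pairs, circular=True):
    """Check that each tuple starts with the previous tuple's last element."""
    linked = all(prev[-1] == cur[0] for prev, cur in zip(pairs, pairs[1:]))
    if circular and pairs:
        # For a cycle, the last tuple must also lead back to the first one.
        return linked and pairs[-1][-1] == pairs[0][0]
    return linked


print(is_consecutive([(10, 7), (7, 13), (13, 4), (4, 9), (9, 10)]))  # True
print(is_consecutive([(10, 7), (4, 9), (13, 4), (7, 13), (9, 10)]))  # False
```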
