简体   繁体   中英

Trouble implementing merge sort in Python

Assumption: length of the array is a power of 2 (here 2^3 = 16)

Here's the code I'm working with"

def merge_sort(arr):
   length = len(arr)
   if length <= 1:
      return arr
   else:
      mid = int(length/2)
      L, R = merge_sort(arr[:mid]), merge_sort(arr[mid:])
      return merge(L,R)

def merge(L,R):
   result = []
   low, high = 0, 2*max(len(L),len(R))
   i = j = 0
   for k in range(low,high):
     if L[i] < R[j]:
        result.append(L[i])
        i = i + 1
     else:
        result.append(R[j])
        j = j + 1
return result
   
>arr = [2, 40, 0, 66, 30, 33, 27, 69, 31, 82, 53, 26, 11, 29, 50, 59]
>merge_sort(arr)
IndexError: list index out of range

The merge_sort function is supposed to be returning sorted versions of the first and second half of the array, instead, it's returning a list of the first element of the array. Don't know what I am doing wrong, please help!

My guess is that you have to change

mid = int(length/2)

to

mid = int(length//2)

def merge_sort(arr):
    length = len(arr)
    if length <= 1:
        return arr
    else:
        mid = int(length // 2)
        L, R = merge_sort(arr[:mid]), merge_sort(arr[mid:])
        return merge(L, R)


def merge(L, R):
    result = []
    low, high = 0, 2 * max(len(L), len(R))
    i = j = 0
    for k in range(low, high):
        if L[i] < R[j]:
            result.append(L[i])
            i = i + 1
        else:
            result.append(R[j])
            j = j + 1

        return result

arr = [2, 40, 0, 66, 30, 33, 27, 69, 31, 82, 53, 26, 11, 29, 50, 59]
print(merge_sort(arr))

After you fix that, it seems there are still bugs in your mergesort implementation.


Apart from that, here is an iterative mergesort implementation I've done a while ago, might help:

import random
from typing import TypeVar, List
from scipy import stats

T = TypeVar('T')


def iterative_merge_sort(input_list: List[T]) -> List[T]:
    """
    Iterative Bottom Up Merge Sort
    -----------------------------------------------
    Merge Sort is a Divide and Conquer algorithm.
    It divides the input array in two halves,
    calls itself for the two halves and then merges the two sorted halves.
    The merge function is used for merging two halves.

    In this algorithm, we'd start from the smallest window size of 1
    and double up the window size in each level until we'd reach to the
    last level, which would be the size of input list.

    Attributes:
    - Time Complexity: O(N*Log N)
        - In each level, there are O(N) comparisons
        - There are O(Log N) levels
    - Space Complexity: O(N) it is not an in-place sort
    - Stable Sort

    """

    # Assigns the length of input list.
    size_of_input = len(input_list)

    # Creates a new list for sorted output list.
    temp_output_list = size_of_input * [None]

    # Starts from the window size 1 and doubles up
    window_size = 1
    while window_size <= size_of_input - 1:
        sort_in_level(input_list, temp_output_list, window_size, size_of_input)
        window_size = 2 * window_size

    return temp_output_list


def sort_in_level(input_list: List[T], temp_output_list: List[T],
                  window_size: int, size_of_input: int) -> List[T]:
    """
    In this method we would assign four indices for start and end indices of two sublists:
    - Starting Index for the first sublist
    - Ending Index for the first sublist
    - Starting Index for the second sublist
    - Ending Index for the second sublist

    Then, we'd do the divide and merge
    """

    # Assigns the Starting Index of the first sublist to 0.
    first_start_index = 0
    while first_start_index + window_size <= size_of_input - 1:
        first_end_index = first_start_index + window_size - 1
        second_start_index = first_start_index + window_size
        second_end_index = second_start_index + window_size - 1

        # In case, the Second Ending Index went over the limit, which is the length of input list.
        if second_end_index >= size_of_input:
            second_end_index = size_of_input - 1

        # Merges the two sublists
        merge_sublists(input_list, temp_output_list, first_start_index,
                       first_end_index, second_start_index, second_end_index)

        # Increments the Starting Index for the first sublist for the next sublist merging
        first_start_index = second_end_index + 1

    # In case, any other element would be left, it'd be placed in the temp output list
    for i in range(first_start_index, size_of_input):
        temp_output_list[i] = input_list[i]

    # Copies to the sorted part of the output list.
    copy_sorted_list(input_list, temp_output_list, size_of_input)


def merge_sublists(input_list: List[T], temp_output_list: List[T],
                   first_start_index: int, first_end_index: int,
                   second_start_index: int, second_end_index: int) -> List[T]:
    """
    This method would merges the sublists using three simple while loops:
    - In the first while, we'd continue until the min length in between the two sublists.
    - In the second while, we'd continue if any element would be left from the sublist 1.
    - In the third while, we'd continue if any element would be left from the sublist 2.

e.g., sublist 1 [1, 3, 5, 7, 9]
e.g., sublist 2 [2, 4, 6, 8, 10, 12, 14]

- First while loop generates: [1, 2, 3, 4, 5, 6 , 7, 8, 9, 10]
- Second while loop just passes, since no elements left from the first sublist.
- Third while loop generates: [1, 2, 3, 4, 5, 6 , 7, 8, 9, 10, 12, 14]

    """

    # Assigns the Sublist 1 and temp output list indices (i, k)
    i = k = first_start_index
    # Assigns the Sublist 2 index (j)
    j = second_start_index

    # For if the size of smallest sublists:
    while i <= first_end_index and j <= second_end_index:
        # If the Sublist 1 element is smaller
        if input_list[i] <= input_list[j]:
            # Places the Sublist 1 element in the temp output list
            temp_output_list[k] = input_list[i]
            # Increments the Sublist 1
            i += 1
        else:
            # Places the Sublist 2 element in the temp output list
            temp_output_list[k] = input_list[j]
            # Increments the Sublist 2
            j += 1
        # Increments the temp output list
        k += 1

    # For if any element would be left in Sublist 1
    while i <= first_end_index:
        # Copies the Sublist 1 element in the temp output list
        temp_output_list[k] = input_list[i]
        # Increments the Sublist 1
        i += 1
        # Increments the temp output list
        k += 1

    # For if any element would be left in Sublist 2
    while j <= second_end_index:
        # Copies the Sublist 2 element in the temp output list
        temp_output_list[k] = input_list[j]
        # Increments the Sublist 2
        j += 1
        # Increments the temp output list
        k += 1


def copy_sorted_list(input_list, temp_output_list, size_of_input):
    """
    This method copies the sorted temp output into the input list
    """
    for i in range(size_of_input):
        input_list[i] = temp_output_list[i]


if __name__ == "__main__":

    # Creates a dash line string and a new line for in between the tests.
    delimiter = "-" * 70 + "\n"

    # Generates a random integer list.
    TEST_LIST_INTEGER = random.sample(range(-100, 100), 15) * 3
    print(f"""The unsorted integer array is:
        {TEST_LIST_INTEGER}""")
    print(delimiter)

    # Generates a random float list.
    TEST_LIST_FLOAT = stats.uniform(0, 100).rvs(45)
    print(f"""The unsorted float array is:
        {TEST_LIST_FLOAT}""")
    print(delimiter)

    # Sample float/integer test list for input.
    INTEGER_FLOAT_INPUT = list(TEST_LIST_INTEGER + TEST_LIST_FLOAT)

    # Sample float/integer test list for output.
    INTEGER_FLOAT_OUTPUT = sorted(INTEGER_FLOAT_INPUT)

    sorting_algorithms = [
        ("Iterative Merge Sort", iterative_merge_sort)
    ]

    # Testing
    for description, func in sorting_algorithms:
        if (func(INTEGER_FLOAT_INPUT.copy()) == INTEGER_FLOAT_OUTPUT):
            print(f"{description} Test was Successful.")
        else:
            print(f"{description} Test was not Successful.")
        print(f"""{description} (Integer):
            {func(TEST_LIST_INTEGER.copy())}""")
        print(f"""{description} (Float):
            {func(TEST_LIST_FLOAT.copy())}""")
        print(delimiter)

Output

The unsorted integer array is:
        [-95, -43, -53, -51, -22, -69, 86, -62, 18, 57, -40, 35, 27, 69, 25, -95, -43, -53, -51, -22, -69, 86, -62, 18, 57, -40, 35, 27, 69, 25, -95, -43, -53, -51, -22, -69, 86, -62, 18, 57, -40, 35, 27, 69, 25]
----------------------------------------------------------------------

The unsorted float array is:
        [13.73951669 25.36749628 49.41536391 52.51297191 31.96693221 83.35130179
 78.15310692 87.80745009 71.72315753 31.93650165 39.01372258 54.08165357
  8.71619405 55.70713018 62.89580284 55.38593506 46.19091644 56.25332329
  9.01142232 71.7087899  10.3590102  64.12106354 11.96950344 10.02041072
 11.25998807  3.4682743   4.2306855  56.47684705 12.55798209 31.72904102
 25.58799475 12.36917936 14.79663461 41.44290086  8.07286915 30.53615601
 28.93814861 79.71892673 51.05525118 60.84951152 40.79634336 98.53217422
 40.13052397 37.54257899 30.84492737]
----------------------------------------------------------------------

Iterative Merge Sort Test was Successful.
Iterative Merge Sort (Integer):
            [-95, -95, -95, -69, -69, -69, -62, -62, -62, -53, -53, -53, -51, -51, -51, -43, -43, -43, -40, -40, -40, -22, -22, -22, 18, 18, 18, 25, 25, 25, 27, 27, 27, 35, 35, 35, 57, 57, 57, 69, 69, 69, 86, 86, 86]
Iterative Merge Sort (Float):
            [3.468274302247476, 4.2306854988557685, 8.072869149475581, 8.716194048896797, 9.011422324931061, 10.020410716108207, 10.359010198120444, 11.259988072653337, 11.969503435877515, 12.369179362897519, 12.557982088413011, 13.73951669249841, 14.796634612698023, 25.367496277946678, 25.58799475149002, 28.938148605302082, 30.536156009314432, 30.844927367382567, 31.729041015605798, 31.936501654196327, 31.966932211561073, 37.542578990273, 39.01372257660489, 40.13052397245499, 40.796343357498124, 41.44290086358231, 46.19091644398402, 49.41536391214038, 51.055251182405236, 52.51297190698191, 54.08165357100877, 55.38593505776328, 55.7071301839314, 56.25332328511436, 56.47684704896035, 60.849511522015035, 62.89580283777695, 64.121063540564, 71.70878990293969, 71.7231575324151, 78.15310692082691, 79.71892673486917, 83.35130178696363, 87.80745008766068, 98.53217421868314]
----------------------------------------------------------------------

When all elements from array "L" are merged, the index i try to access "out of range".

I fix your code as following:

def merge(L, R):
    result = []
    i = j = 0
    lenL, lenR = len(L), len(R)
    while i < lenL or j < lenR:
        if j >= lenR or (i < lenL and L[i] < R[j]):
            result.append(L[i])
            i = i + 1
        else:
            result.append(R[j])
            j = j + 1
    return result

It works well.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM