count inversions in mergesort python

Question

I want to count how many inversions there are in a list while sorting the list using mergesort. This is my code so far, where 'x' counts the number of inversions while the rest sorts the list:

import sys
x = 0

def merge_sort(A):
    merge_sort2(A, 0, len(A) - 1)


def merge_sort2(A, first, last):
    if first < last:
        middle = (first + last) // 2
        merge_sort2(A, first, middle)
        merge_sort2(A, middle + 1, last)
        merge(A, first, middle, last)


def merge(A, first, middle, last):
    global x
    L = A[first:middle + 1]
    R = A[middle + 1:last + 1]
    L.append(sys.maxsize)
    R.append(sys.maxsize)
    i = j = 0

    for k in range(first, last + 1):
        if L[i] <= R[j]:
            A[k] = L[i]
            i += 1
        else:
            A[k] = R[j]
            j += 1
            x += 1
            x += len(L[first + 1:])

When I call merge_sort on a list, the variable x is supposed to give the number of inversions in the list. So if the list was [4,3,2,1], x would be 6. If the list was [1,2,3], x would be 0. I change the value of x in merge whenever the element taken from the right half is smaller than the current element of the left half; however, the number always gets way too big. What am I doing wrong?

Answer 1

Check my work, but I think instead of:

x += 1
x += len(L[first + 1:])

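you want:

x += middle + 1 + j - k

placed before the j += 1 line. If you leave it after the increment, where your two lines are now, use middle + j - k instead, because j is already one larger by then. Basically, you want to add the difference between where item k is actually coming from and where you'd expect it to come from if everything was already sorted.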
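Here is a minimal sketch of that idea so it can be checked on small inputs. The names sort_and_count, _sort_and_count and _merge_and_count are made up for illustration, and the count is returned instead of stored in a global. It keeps the sentinel merge from the question, so it assumes every value in the list is smaller than sys.maxsize.

```python
import sys


def sort_and_count(A):
    """Sort A in place and return its inversion count (hypothetical helper)."""
    return _sort_and_count(A, 0, len(A) - 1)


def _sort_and_count(A, first, last):
    if first >= last:
        return 0
    middle = (first + last) // 2
    count = _sort_and_count(A, first, middle)
    count += _sort_and_count(A, middle + 1, last)
    return count + _merge_and_count(A, first, middle, last)


def _merge_and_count(A, first, middle, last):
    # Sentinels as in the question; assumes all values are < sys.maxsize.
    L = A[first:middle + 1] + [sys.maxsize]
    R = A[middle + 1:last + 1] + [sys.maxsize]
    i = j = count = 0
    for k in range(first, last + 1):
        if L[i] <= R[j]:
            A[k] = L[i]
            i += 1
        else:
            # R[j] sat at index middle + 1 + j and lands at index k,
            # so it jumps over (middle + 1 + j) - k elements of L.
            count += middle + 1 + j - k
            A[k] = R[j]
            j += 1
    return count


data = [4, 3, 2, 1]
print(sort_and_count(data), data)  # 6 [1, 2, 3, 4]
```

Since k is always first + i + j, the difference works out to the number of real (non-sentinel) elements still waiting in L, which is why the sentinel never gets counted.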

Answer 2

Your merge step is a little hard for me to understand. I'm not sure why you are doing this (maybe just another way to merge?):

L.append(sys.maxsize)
R.append(sys.maxsize)

but I couldn't get everything to work out with the extra elements added to the partitions, and I think you end up counting the extra sentinel element in L as an inversion with each merge move from R.

I think that's causing some of the problems. But you also have two other issues:

Your last line isn't quite the right logic:

 x += len(L[first + 1:])

The number of inversions will be the number of elements in L that you jump over. Because your slice starts at first + 1 rather than at the current position i, the amount you add doesn't depend on how far into L the merge has got. Something like this works better:

x += len(L[i:])

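And then at the end, you may have elements left over in L or R. Your extra elements take care of that, but in a more traditional merge you have to copy the leftovers across yourself. They don't add any new inversions: leftovers in L were already counted when elements of R jumped over them, and leftovers in R are only copied once L is empty. Here's the way I would count the inversions: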

def merge(A, first, middle, last):
    global x
    L = A[first:middle+1]
    R = A[middle+1:last+1]
    i = j = 0
    k = first
    while i < len(L) and j < len(R):
        if L[i] <= R[j]:
            A[k] = L[i]
            i += 1
        else:
            A[k] = R[j]
            j += 1
            # the element just taken from R jumped over everything still left in L
            x += len(L) - i
        k += 1
    # take care of any leftovers in L or R; these add no new inversions
    while i < len(L):
        A[k] = L[i]
        i += 1
        k += 1
    while j < len(R):
        A[k] = R[j]
        j += 1
        k += 1
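
One thing to watch with the global x is that it has to be reset to 0 before every new call. As a rough sketch of an alternative (not your original structure), here is a variant that returns the count along with a sorted copy. The name sort_count is made up for illustration, and the loop at the end just cross-checks the result against the pair-by-pair definition of an inversion on some random lists.

```python
import random
from itertools import combinations


def sort_count(values):
    """Return (sorted copy, inversion count); hypothetical global-free variant."""
    if len(values) <= 1:
        return list(values), 0
    mid = len(values) // 2
    left, left_count = sort_count(values[:mid])
    right, right_count = sort_count(values[mid:])
    merged = []
    count = left_count + right_count
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
            # Every element still in left is larger than the one just taken.
            count += len(left) - i
    # Leftovers are already in order and add no new inversions.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, count


# Cross-check against the O(n^2) definition on small random inputs.
random.seed(0)
for _ in range(200):
    data = [random.randint(0, 9) for _ in range(random.randint(0, 12))]
    expected = sum(a > b for a, b in combinations(data, 2))
    assert sort_count(data) == (sorted(data), expected)

print(sort_count([4, 3, 2, 1]))  # ([1, 2, 3, 4], 6)
```

This version copies sublists at every level, so it trades some memory for not having to track first, middle and last.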

solution1
0 2018-03-29 22:58:57

solution2
0 2018-03-30 00:07:33
