Counting inversions with merge sort gives a negative number if the array length is 100000

I am still a beginner at programming and I am taking an online course (algorithms).

One of the practice questions was to count the number of inversions in a file containing 100000 randomly ordered numbers. I have tried this code on small data sets and it worked fine, but when I pass it the actual data set it reports a negative inversion count. I have tried various solutions from different platforms but still couldn't resolve it.

So this is my code:

#include "stdafx.h"
#include <iostream>
#include <conio.h>
#include <fstream>

using namespace std;

long merge(int a[], int start, int mid, int end)
{
    int i = start; 
    int j = mid + 1; 
    int k = start; 
    int inversion=0;
    int temp[100000];

    while (i <= mid && j <= end)
    {
        if (a[i] < a[j])  
        {
            temp[k++] = a[i++]; 
        }
        else 
        {
            temp[k++] = a[j++]; 
            // a[j] is smaller than every remaining element a[i..mid]
            inversion = inversion + (mid - i + 1);
        }
    }
    while (i <= mid) 
    {
        temp[k++] = a[i++]; 
    }
    while (j <= end) 
    {
        temp[k++] = a[j++]; 
    }

    for (int i = start; i <= end; i++)
    {
        a[i] = temp[i]; 
    }
    return inversion;
}

long Msort(int a[], int start,int end)
{
    if (start >= end)
    {
        return 0;
    }
    int inversion = 0;
    int mid = (start + end) / 2;

    inversion += Msort(a, start, mid);
    inversion += Msort(a, mid + 1, end); 

    inversion += merge(a, start, mid, end);
    return inversion;
}

long ReadFromFile(char FileName[], int storage[],int n)
{
    int b;
    int count=0;
    ifstream get(FileName);
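    if (!get)
    {
        cout << "no file found";
        return 0;
    }
    // Stop when a read fails or the buffer is full. Testing eof() before
    // reading can count one extra element if the file ends with whitespace.
    while (count < n && get >> storage[count])
    {
        count++;
    }
    b = count;
    return b;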
}

int main()
{
    int valuescount = 0;
    int arr[100000];
    char filename[] = { "file.txt" };
    long n = sizeof(arr) / sizeof(arr[0]);
    valuescount=ReadFromFile(filename, arr,n);
    int no_Of_Inversions = Msort(arr, 0, valuescount -1);
    cout << endl << "No of inversions are" << '\t' << no_Of_Inversions <<'\t';
    cout <<endl<< "Total no of array values sorted"<< valuescount<<endl;
    system("pause");
}

The problem is not the input size as such. The negative result comes from an overflow in the variable inversion of the function merge, which only shows up once the input is large enough for the count to exceed what an int can hold.

Consider your input size N = 100000 . If the array is sorted in decreasing order, every pair of elements is an inversion, so there are N * (N-1) / 2 = 4,999,950,000 inversions to count. That value exceeds even the maximum of a 32-bit unsigned int (4,294,967,295), let alone that of a signed int (2,147,483,647). A randomly ordered array has about half as many inversions on average, N * (N-1) / 4 = 2,499,975,000, which is still too large for an int. Consequently, when you accumulate the count in a variable of type int, it overflows, which in practice shows up as a negative result.
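The arithmetic above can be checked with a few lines of C++. This is only a sketch: maxInversions and fitsInInt are hypothetical helper names introduced for illustration and do not appear in the question's code.

```cpp
#include <climits>

// Hypothetical helper: the largest possible inversion count for n elements,
// reached when the array is sorted in strictly decreasing order.
long long maxInversions(long long n)
{
    return n * (n - 1) / 2;
}

// Hypothetical helper: true if the value can be stored in a plain int.
bool fitsInInt(long long value)
{
    return value >= INT_MIN && value <= INT_MAX;
}
```

Calling maxInversions(100000) yields 4999950000, and fitsInInt reports false for that value on a platform where int is 32 bits.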

To fix this, change the type of the variable inversion from int to long long in both merge and Msort , and change the return type of both functions to long long as well. (A return type of long is not enough on platforms where long is 32 bits, as it is with Visual C++.) Likewise, store the return value of the Msort call in main in a variable of type long long ; that is, declare no_Of_Inversions as long long .
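Put together, the changes could look roughly like the sketch below. It is not the asker's exact code: the names mergeCount, sortCount and countInversions are made up for this example, a std::vector replaces the fixed-size arrays, and the comparison uses <= so that equal values are not counted as inversions.

```cpp
#include <vector>

// Merges the sorted ranges a[start..mid] and a[mid+1..end] (inclusive bounds)
// and returns the number of inversions that span the two ranges.
long long mergeCount(std::vector<int>& a, std::vector<int>& temp,
                     int start, int mid, int end)
{
    int i = start, j = mid + 1, k = start;
    long long inversions = 0;  // 64-bit counter, so it cannot overflow here
    while (i <= mid && j <= end)
    {
        if (a[i] <= a[j])
        {
            temp[k++] = a[i++];
        }
        else
        {
            // a[j] is smaller than every remaining element a[i..mid]
            temp[k++] = a[j++];
            inversions += mid - i + 1;
        }
    }
    while (i <= mid) temp[k++] = a[i++];
    while (j <= end) temp[k++] = a[j++];
    for (int p = start; p <= end; ++p) a[p] = temp[p];
    return inversions;
}

// Sorts a[start..end] and returns its inversion count.
long long sortCount(std::vector<int>& a, std::vector<int>& temp,
                    int start, int end)
{
    if (start >= end) return 0;
    int mid = start + (end - start) / 2;
    long long inversions = sortCount(a, temp, start, mid);
    inversions += sortCount(a, temp, mid + 1, end);
    inversions += mergeCount(a, temp, start, mid, end);
    return inversions;
}

// Takes the values by copy, so the caller's data stays unsorted.
long long countInversions(std::vector<int> values)
{
    std::vector<int> temp(values.size());
    return sortCount(values, temp, 0, static_cast<int>(values.size()) - 1);
}
```

The result of countInversions should itself be stored in a long long (for example long long no_Of_Inversions = countInversions(values);), otherwise the value is truncated again on assignment.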

