Recursive merge sort, segmentation fault for big arrays

I have implemented merge sort recursively. It works up to a certain array size, then crashes with a "segmentation fault". On an Intel Xeon with 16 GB of RAM, the maximum float array size is 17352; the limit is higher for an int array and lower for a double array. On an AMD A10 with 16 GB, the limit is 2068 for floats. Clearly there is a memory issue. Other sorting algorithms I wrote for arrays (non-recursive) work fine for up to ~2e6 elements. The compiler is GCC 4.4.7. How do I improve this merge sort so it works for bigger arrays?

#include <iostream>
#include <stdlib.h>
#include <cmath>
#include <vector>
using namespace std;

// --------------------------------------------------------
// merge 2 subarrays of 1 array around its middle im
template <class C>
void merge(C* arr, int ilow, int imid, int ihigh)
{
vector<C> temp; // array seg faults earlier than vector
for(int i=ilow; i<=ihigh; i++) temp.push_back(arr[i]); // copy array

int i1=ilow, i2=imid+1, ai=ilow; // starting positions

while(i1<=imid && i2<=ihigh) // compare 1st and 2nd halves
{
    if(temp[i1]<=arr[i2])
    {
        arr[ai] = temp[i1];
        i1++; // leave smaller val behind
    }
    else
    {
        arr[ai] = temp[i2];
        i2++; // leave smaller val behind
    }
    ai++; // move forward
}

if(i2>ihigh) while(i1<=imid) // if 2nd is done, copy the rest from 1st
{
    arr[ai] = temp[i1];
    i1++;
    ai++;
}

if(i1>imid) while(i2<=ihigh) // if 1st is done, copy the rest from 2nd
{
    arr[ai] = temp[i2];
    i2++;
    ai++;
}

} // merge()



// --------------------------------------------------------
// merge sort algorithm for arrays
template <class C>
void sort_merge(C* arr, int ilow, int ihigh)
{

if(ilow < ihigh)
{
    int imid = (ilow+ihigh)/2; // get middle point
    sort_merge(arr, ilow,   imid); // do 1st half
    sort_merge(arr, imid+1, ihigh); // do 2nd half
    merge(arr, ilow, imid, ihigh); // merge 1st and 2nd halves
}

return;
} // sort_merge()



///////////////////////////////////////////////////////////
int main(int argc, char *argv[])
{
// crashes at 17353 on Intel Xeon, and at 2069 on AMD A10, both 16Gb of ram
const int N=17352+0;
float arr[N]; // with arr[double] crashes sooner, with arr[int] crashes later

// fill array
for(long int i=0; i<N; i++)
{
    //arr[i] = rand()*1.0/RAND_MAX; // random
    arr[i] = sin(i*10)+cos(i*10); // partially sorted
    //arr[i] = i; // sorted
    //arr[i] = -i; // reversed
}

sort_merge(arr, 0, N-1);

return 0;
}

Consider the way you are copying the array:

vector<C> temp; // array seg faults earlier than vector
for(int i=ilow; i<=ihigh; i++) temp.push_back(arr[i]); // copy array

When this completes, temp contains ihigh - ilow + 1 values, which are accessible from temp[0] to temp[ihigh - ilow]. This means all values in temp are offset by -ilow compared to arr.

However, the rest of your code accesses temp with the indices of the source array, for example:

if(temp[i1]<=arr[i2]) // i1 isn't a valid index into temp, should be (i1 - ilow)

Hence the crash. When using the proper offset into temp, your code seems to work correctly.
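As a minimal sketch of that fix, the merge step below subtracts ilow from every index into temp. The names merge_fixed and sort_merge_fixed are hypothetical, chosen only to avoid clashing with the question's functions; reading both halves from temp (rather than mixing temp and arr) is a stylistic choice here, not something the original answer requires.

```cpp
#include <cassert>
#include <vector>

// Merge the sorted ranges arr[ilow..imid] and arr[imid+1..ihigh] in place.
// (merge_fixed is a hypothetical name for the corrected merge().)
template <class C>
void merge_fixed(C* arr, int ilow, int imid, int ihigh)
{
    assert(ilow <= imid && imid <= ihigh);

    // temp[k] holds a copy of arr[ilow + k], so indices into temp
    // are shifted by -ilow relative to indices into arr.
    std::vector<C> temp(arr + ilow, arr + ihigh + 1);

    int i1 = ilow, i2 = imid + 1, ai = ilow; // starting positions

    while (i1 <= imid && i2 <= ihigh) // compare 1st and 2nd halves
    {
        if (temp[i1 - ilow] <= temp[i2 - ilow])
        {
            arr[ai] = temp[i1 - ilow];
            i1++;
        }
        else
        {
            arr[ai] = temp[i2 - ilow];
            i2++;
        }
        ai++;
    }

    // At most one of these loops runs: copy whatever is left over.
    while (i1 <= imid)  { arr[ai] = temp[i1 - ilow]; i1++; ai++; }
    while (i2 <= ihigh) { arr[ai] = temp[i2 - ilow]; i2++; ai++; }
}

// Same recursion as the question's sort_merge(), calling the corrected merge.
template <class C>
void sort_merge_fixed(C* arr, int ilow, int ihigh)
{
    if (ilow < ihigh)
    {
        int imid = ilow + (ihigh - ilow) / 2; // middle point
        sort_merge_fixed(arr, ilow, imid);
        sort_merge_fixed(arr, imid + 1, ihigh);
        merge_fixed(arr, ilow, imid, ihigh);
    }
}
```

The recursion depth is only about log2(N), so the recursion itself should not be what limits the array size.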

The following, all by itself, is enough to cause a segmentation fault if N is large enough, due to the notorious stack overflow. If not, filling the array should.

int main(int argc, char *argv[])
{
    // crashes at 17353 on Intel Xeon, and at 2069 on AMD A10, both 16Gb of ram
    const int N=17352+0;
    float arr[N];
}

The reason is that local variables tend to be allocated on the stack, but the stack is limited in size and not designed for massive memory allocations. If you instead did

float *arr = new float[N]; // probably should be unique_ptr instead....

or

std::vector<float> arr(N);

you wouldn't have any problems, since both of these methods allocate memory on the heap.
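A rough sketch of the std::vector variant, assuming the same "partially sorted" fill as in the question. The helper name make_test_data is hypothetical and not part of the original code.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper: builds the question's "partially sorted" test data.
// The vector's elements live on the heap, so n is not bounded by the stack size.
std::vector<float> make_test_data(int n)
{
    assert(n >= 0);
    std::vector<float> arr(n);
    for (int i = 0; i < n; ++i)
        arr[i] = std::sin(i * 10.0f) + std::cos(i * 10.0f);
    return arr;
}
```

The resulting buffer can then be handed to a pointer-based sort such as the question's, for example sort_merge(&arr[0], 0, n - 1), since a vector stores its elements contiguously.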
