
Insertion Sort causes Segmentation fault: 11 after scaling

I wrote a simple insertion sort implementation to knock the rust off and to start building what I hope is a better understanding of algorithms in general. The file holds 20 million random numbers. The code is below:

#include <fstream>
#include <time.h>
#include <cstdlib>

using namespace std;

void insertionSort(double numbers[], int array_size);

int main() {
    int count;
    double twentyMMNumbers[20000000];
    ifstream inputFile;
    time_t now;
    time(&now);

    inputFile.open("20KRandomsNumbers.data");       //Opens input file
    if(inputFile.fail())
    {
        printf("Cannot open inputFile");
        exit(1);
    }

    count = 0;
    printf("%d\n",count);
    inputFile >> twentyMMNumbers[count];
    printf("%f\n",twentyMMNumbers[count]);
    while(inputFile)
    {   //While loop
        count++;
        if(count < 20000000)
            inputFile >> twentyMMNumbers[count];
    }
    inputFile.close();  

    printf("%s\n", ctime(&now));    //BEFORE
    insertionSort(twentyMMNumbers, 20000000); //Insertion Sort 20KRandomNumbers
    printf("%s\n", ctime(&now)); //AFTER
    for(int i = 0; i < count; i++)
        printf("%f\n",twentyMMNumbers[i]);  
}

void insertionSort(double numbers[], int array_size)
{
  int i, j, index;
  for (i=1; i < array_size; i++)
  {
    index = numbers[i];
    j = i;
    while ((j > 0) && (numbers[j-1] > index))
    {
      numbers[j] = numbers[j-1];
      j = j - 1;
    }
    numbers[j] = index;
  }
}

The code worked fine when it only had 20,000 entries, but now gives me:

Segmentation fault: 11

Was this caused by my increase in the size of the array? P.S. If you have any tips on optimizing this, feel free to point them out.

Ironically enough (given this site's name), you have a stack overflow. You'll need to dynamically allocate that much memory on the heap.

For more clarity, this line:

double twentyMMNumbers[20000000];

needs to be

double* twentyMMNumbers = (double*)malloc(20000000*sizeof(double));

And of course, you'll need to free that memory before you exit your program (as a best practice):

free(twentyMMNumbers);
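
For scale: 20,000,000 doubles at 8 bytes each (the usual size of a double) come to about 160 MB, while default stack limits are commonly only a few megabytes. As a minimal C++-style sketch of the same fix, assuming std::vector is acceptable in place of malloc/free, the buffer can be obtained like this. The helper name makeBuffer is hypothetical and not part of the original code:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: returns a buffer of `n` doubles, all initialized to 0.
// std::vector keeps its elements on the heap, so a large `n` does not
// consume stack space, and the memory is released automatically when the
// vector goes out of scope (no explicit free needed).
std::vector<double> makeBuffer(std::size_t n) {
    return std::vector<double>(n);
}
```

A caller could then write `std::vector<double> twentyMMNumbers = makeBuffer(20000000);` and pass `twentyMMNumbers.data()` to the existing insertionSort function.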

It works with smaller array sizes, so here are some comments (since it was on CodeReview) and some fixes for bigger array sizes.

1, If you want to use fixed-size arrays, use a named constant instead of the 20000 (or 20000000) literals.

2, Using dynamic arrays is better. Allocate the memory after you have read the first line, and use the size you read as the size of the new array. Furthermore, I would store the size of the data file (the first line of the file) in a separate variable, not in the array.

int dataSize;
inputFile >> dataSize;
double *twentyMMNumbers = new double[dataSize];

It allocates the exact amount of memory. No more, no less.

It also fixes the segmentation fault. For more information, check this question: Segmentation fault on large array sizes

(Don't forget to deallocate the array with delete[].)

3, It's unnecessary to read the whole file if it has more records than the array can hold. I'd modify the while loop:

while (inputFile) {
    if (count >= dataSize) {
        break;
    }
    if (inputFile >> twentyMMNumbers[count]) {
        count++;    // count only the values that were actually read
    }
}
inputFile.close();

Maybe an exit(-1) and an error message would be better than the break.

4, The following comment is unnecessary:

//While loop

5, You should pass the real size of the array to the insertionSort function, so write this:

insertionSort(twentyMMNumbers, dataSize); 

The comment here is also unnecessary.

6, Improve error handling: what happens when the value of dataSize is bigger than the count of numbers in the file?

7, I would extract a printArray function from the last for loop, as well as a readInput function.

8, Consider using C++ style printing instead of printf:

cout << "Hello world!" << endl;

(#include <iostream> is required.)

You have a problem in your while loop:

while(inputFile)
{   //While loop
    count++;
    if(count < 20000000)
        inputFile >> twentyMMNumbers[count];
}

This loop will only terminate if the file contains at most 20000000 numbers. Once count reaches 20000000, the loop no longer reads from inputFile, but it still uses inputFile as the condition to continue, so the stream never fails and the loop keeps running. When count then passes the maximum int value, it will wrap around (strictly speaking, signed overflow is undefined behavior) and you will index twentyMMNumbers with a negative number. This could very well be the reason you get a segmentation fault.

20000000 is a "magic number" in your code. Instead of writing it everywhere, define const int NumElements = 20000000 at the beginning of your file.
