How to fix “realloc(): invalid pointer”

I am trying to write a function to convert a text file into a CSV file. The input file has 3 lines of space-delimited entries. I have to find a way to read each line into a string and transform the three lines of the input file into three columns of a CSV file.

The files look like this:

Jake Ali Maria
24 23 43
Montreal Johannesburg Sydney

And I have to transform it into something like this:

Jake, 24, Montreal
...etc

I figured I could create a char **line variable that would hold three references to three separate char arrays, one for each of the three lines of the input file. I.e., my goal is to have *(line+i) store the (i+1)th line of the file.

I wanted to avoid hardcoding char array sizes, such as

char line1 [999]; 
fgets(line1, 999, file);

so I wrote a while loop to fgets pieces of a line into a small buffer array of predetermined size, and then strcat and realloc memory as necessary to store the line as a string, with *(line+i) as a pointer to the string, where i is 0 for the first line, 1 for the second, etc.

Here is the problematic code:

#include <stdio.h>
#include<stdlib.h>
#include<string.h>

#define CHUNK 10

char** getLines (const char * filename){
    FILE *file = fopen(filename, "rt");
    char **lines = (char ** ) calloc(3, sizeof(char*));
    char buffer[CHUNK];
    for(int i = 0; i < 3; i++){
        int lineLength = 0;
        int bufferLength = 0;
        *(lines+i) = NULL;
        do{
            fgets(buffer, CHUNK, file);
            buffLength = strlen(buffer);
            lineLength += buffLength;
            *(lines+i) = (char*) realloc(*(lines+i), (lineLength +1)*sizeof(char));
            strcat(*(lines+i), buffer);
        }while(bufferLength ==CHUNK-1);
    }
    puts(*(lines+0));
    puts(*(lines+1));
    puts(*(lines+2));

    fclose(file);
}

void load_and_convert(const char* filename){
    char ** lines = getLines(filename);
}

int main(){
    const char* filename = "demo.txt";
    load_and_convert(filename);
}

This works as expected only for i=0. However, going through this with GDB, I see that I get a realloc(): invalid pointer error. The buffer loads fine, and it only crashes when I call realloc in the for loop for i=1, when I get to the second line.

I managed to store the strings like I wanted in a small example I wrote to see what was going on, but the inputs were all on the same line. Maybe this has to do with fgets reading from a new line?

I would really appreciate some help with this, I've been stuck all day.

Thanks a lot!

***edit***

I tried, as suggested, to use calloc instead of malloc to initialize the variable **lines, but I still have the same issue. I have added the modifications to the original code I uploaded.

***edit***

After deleting the file and recompiling, the above now seems to work. Thank you to everyone for helping me out!

You allocate line (which is a misnomer since it's not a single line), which is a pointer to three char*s. You never initialize the contents of line (that is, you never make any of those three char*s point anywhere). Consequently, when you do realloc(*(line + i), ...), the first argument is uninitialized garbage.

To use realloc to do an initial memory allocation, its first argument must be a null pointer. You should explicitly initialize each element of line to NULL first.

Additionally, *(line+i) = (char *)realloc(*(line+i), ...) is still bad because if realloc fails to allocate memory, it will return a null pointer, clobber *(line + i), and leak the old pointer. You should instead split it into separate steps:

char* p = realloc(line[i], ...);
if (p == NULL) {
    // Handle failure somehow.
    exit(1);
}
line[i] = p;
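
The two points above (start every slot at NULL, and assign the result of realloc only after checking it) can be combined in a small sketch. The names make_line_slots, resize_slot and LINE_COUNT are hypothetical and do not appear in the question; this is one possible arrangement, not the only one.

```c
#include <assert.h>
#include <stdlib.h>

#define LINE_COUNT 3

/* Hypothetical helper: allocates LINE_COUNT pointer slots and sets each one
 * to NULL so that it is a valid first argument for realloc. */
char **make_line_slots(void)
{
    char **line = malloc(LINE_COUNT * sizeof *line);
    if (line == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < LINE_COUNT; i++) {
        line[i] = NULL;
    }
    return line;
}

/* Hypothetical helper: resizes the buffer held in *slot to new_size bytes.
 * Returns 0 on success. On failure it returns -1 and leaves *slot untouched,
 * so the old buffer is neither lost nor leaked. */
int resize_slot(char **slot, size_t new_size)
{
    assert(slot != NULL && new_size > 0);
    char *p = realloc(*slot, new_size);  /* acts like malloc when *slot is NULL */
    if (p == NULL) {
        return -1;
    }
    *slot = p;
    return 0;
}
```

A caller would obtain the array with make_line_slots() and then call something like resize_slot(&line[i], lineLength + 1) inside the read loop, checking both return values before using the memory.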

A few more notes:

  • In C, you should avoid casting the result of malloc / realloc / calloc. It's not necessary since C allows implicit conversion from void* to other pointer types, and the explicit cast could mask an error where you accidentally omit #include <stdlib.h>.
  • sizeof(char) is, by definition, 1 byte.
  • When you're allocating memory, it's safer to get into the habit of using T* p = malloc(n * sizeof *p); instead of T* p = malloc(n * sizeof (T));. That way, if the type of p ever changes, you won't silently allocate the wrong amount of memory if you neglect to update the malloc (or realloc or calloc) call.

Here, you have to zero your array of pointers (for example by using calloc()),

char **line = (char**)malloc(sizeof(char*)*3); //allocate space for three char* pointers

otherwise the reallocs

*(line+i) = (char *)realloc(*(line+i), (inputLength+1)*sizeof(char)); //+1 for the empty character

use an uninitialized pointer, leading to undefined behaviour. That it works with i=0 is pure coincidence and is a typical pitfall when encountering UB.

Furthermore, when using strcat(), you have to make sure that the first parameter is already a zero-terminated string! This is not the case here, since at the first iteration, realloc(NULL, ...); leaves you with an uninitialized buffer. This can lead to strcat() writing past the end of your allocated buffer and lead to heap corruption. A possible fix is to use strcpy() instead of strcat() (this should even be more efficient here, because strcpy() writes directly at the known end of the string instead of scanning for it):

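    do {
        if (fgets(buffer, CHUNK, file) == NULL)
            break;  /* end of file or read error: nothing new in buffer */
        buffLength = strlen(buffer);
        lines[i] = realloc(lines[i], lineLength + buffLength + 1);
        strcpy(lines[i] + lineLength, buffer);  /* append at the current end */
        lineLength += buffLength;
    } while (buffLength == CHUNK-1);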

The loop condition that compares the chunk length with CHUNK-1 will not do what you want if the line (including the newline) is exactly CHUNK-1 bytes long: the whole line has already been read, but the loop continues into the next line. A better check might be while (buffer[buffLength-1] != '\n').

Btw, line[i] is far more readable than *(line+i) (which is semantically identical).
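
As a sketch of how these pieces might fit together, the helper below reads one line of arbitrary length in CHUNK-sized pieces and stops on the newline rather than on the chunk length. The name read_line is hypothetical, and stripping the trailing newline is an assumption made here for convenience; neither is part of the question's code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK 10

/* Hypothetical helper: reads one line of any length from file.
 * Returns a heap-allocated string without the trailing newline, or NULL
 * if nothing could be read or an allocation failed. The caller frees it. */
char *read_line(FILE *file)
{
    assert(file != NULL);

    char buffer[CHUNK];
    char *line = NULL;       /* NULL so the first realloc acts like malloc */
    size_t lineLength = 0;

    while (fgets(buffer, CHUNK, file) != NULL) {
        size_t buffLength = strlen(buffer);

        /* Grow through a temporary so a failure does not leak the old block. */
        char *p = realloc(line, lineLength + buffLength + 1);
        if (p == NULL) {
            free(line);
            return NULL;
        }
        line = p;

        /* Copy to the known end of the string; no strcat scan needed. */
        strcpy(line + lineLength, buffer);
        lineLength += buffLength;

        /* Stop on the newline itself, not on the chunk length. */
        if (lineLength > 0 && line[lineLength - 1] == '\n') {
            line[lineLength - 1] = '\0';
            break;
        }
    }
    return line;
}
```

With the sample input from the question, the second line ("24 23 43" plus its newline) is exactly CHUNK-1 bytes long, which is the case where comparing against CHUNK-1 would wrongly read on into the third line.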
