How to use fscanf() for two strings (of ANY length) and dynamically allocate/de-allocate the memory properly

I need to read an input.txt file and print out two separate strings from each line in the file. I used a while loop and the fscanf function to get each string and ignore the whitespace between them. If the strings in a line of the input file are too long, I get a segmentation fault. I am also getting a munmap_chunk(): invalid pointer error when I run my executable.

If I don't allocate memory for string1 and string2, fscanf doesn't work properly. I believe fscanf is changing the pointers string1 and string2, which is causing the munmap_chunk() error. However, I need to de-allocate the memory I gave string1 and string2 so that I don't have memory leaks.

How do I scan this file for strings (of ANY length) and de-allocate the memory properly?

int main(int argc, char *argv[]) 
{
  char *string1;
  char *string2;
  string1 = (char *)malloc(sizeof(string1)); //these strings need memory allocated for the fscanf to function properly
  string2 = (char *)malloc(sizeof(string2)); 
  FILE* file = fopen(argv[1], "r");
  
  while (fscanf(file, "%s %s", string1, string2) != EOF)
    {
      printf("%s %s\n", string1, string2);
    }
  fclose(file);

  //Deallocating memory
  free(string1);
  free(string2);
  return 0;
}

fscanf does not change pointers, but it can corrupt memory if you do not allocate enough space for your input.

And you are not allocating the memory correctly: string1 and string2 are pointers, so sizeof(string1) is the size of a pointer, and that is all you are allocating (4 or 8 bytes depending on your system).

If you need to read a line from a file and you do not know the maximum length of the line in advance, you cannot use fscanf with a fixed-size buffer: %s keeps writing until it reaches whitespace, however small the buffer is.

You need to allocate a starting buffer, say something like:

string1 = malloc(512 * sizeof(char));

Here 512 is an arbitrary but reasonably large length for a line. You then use fread to read one byte at a time in a loop, and check for the end of the line (usually '\n').

You must also count how much you have read, and if the line is longer than the buffer, use realloc to increase the size of the buffer, like so:

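if (bytesRead == (string1Size - 1) && curByte != '\n') {
    string1Size += 512;
    char *tmp = realloc(string1, string1Size); // temporary: don't lose string1 on failure
    if (tmp == NULL) { free(string1); exit(EXIT_FAILURE); }
    string1 = tmp;
}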

Here, bytesRead is an int variable counting how many bytes you have successfully read so far, and string1Size is also an int variable, used to track the size of the string1 buffer.
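Putting those steps together, a minimal sketch might look like the following. The helper name read_line and the CHUNK constant are made up for illustration, and it uses fgetc rather than fread to fetch one byte at a time, which is a substitution on my part and not what the answer prescribes.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define CHUNK 512  /* growth step, the same arbitrary 512 as above */

/* Hypothetical helper: reads one line of any length from fp.
   Returns a malloc'd string without the trailing '\n' (the caller frees it),
   or NULL at end of file or if allocation fails. */
char *read_line(FILE *fp)
{
    assert(fp != NULL);

    size_t size = CHUNK;
    size_t bytesRead = 0;
    char *buf = malloc(size);
    if (buf == NULL)
        return NULL;

    int c;
    while ((c = fgetc(fp)) != EOF && c != '\n') {
        if (bytesRead == size - 1) {        /* keep one byte for '\0' */
            char *tmp = realloc(buf, size + CHUNK);
            if (tmp == NULL) {              /* old buffer is still valid */
                free(buf);
                return NULL;
            }
            buf = tmp;
            size += CHUNK;
        }
        buf[bytesRead++] = (char)c;
    }

    if (c == EOF && bytesRead == 0) {       /* nothing left to read */
        free(buf);
        return NULL;
    }
    buf[bytesRead] = '\0';
    return buf;
}
```

The caller would loop with something like while ((line = read_line(file)) != NULL), split the line into its two strings, and free(line) at the end of each iteration.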

string1 = (char *)malloc(sizeof(string1)); allocates memory for just 4 or 8 characters because string1 is a char * and that's how big a pointer is.

To allocate memory for, let's say, 100 characters (99 plus the terminating '\0') you need to do char *string1 = malloc(sizeof(char) * 100).
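With a buffer of known size you can at least stop fscanf from overflowing it by giving %s a field width. The sketch below is one way to rework the question's loop under that assumption; print_pairs and WORD_CAP are hypothetical names, and words longer than 99 characters would be split, not read whole, so this does not solve the "any length" part.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define WORD_CAP 100  /* assumed buffer size, including the terminating '\0' */

/* Hypothetical helper: copies "word1 word2" pairs from in to out.
   Returns the number of pairs printed, or -1 if allocation fails. */
int print_pairs(FILE *in, FILE *out)
{
    assert(in != NULL && out != NULL);

    char *string1 = malloc(WORD_CAP);
    char *string2 = malloc(WORD_CAP);
    if (string1 == NULL || string2 == NULL) {
        free(string1);   /* free(NULL) is a no-op */
        free(string2);
        return -1;
    }

    int pairs = 0;
    /* %99s stores at most 99 characters plus '\0', so neither buffer can
       overflow. Comparing against 2 (rather than EOF) also stops cleanly
       if the file ends after an unpaired word. */
    while (fscanf(in, "%99s %99s", string1, string2) == 2) {
        fprintf(out, "%s %s\n", string1, string2);
        pairs++;
    }

    free(string1);
    free(string2);
    return pairs;
}
```

In the question's main this would be called as print_pairs(file, stdout) after checking that fopen succeeded.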

How do I scan this file for strings (of ANY length) and de-allocate the memory properly?

You can't with fscanf because it mixes reading input with parsing input. You don't know what's going to be read before you parse it.

Instead, read the line into a large buffer where you can examine it. Once you know how big the pieces are, you can allocate just the right amount of memory and copy them into it.

Because we are reusing the line buffer, and throwing it away when we're done, we can make it as large as we think we'll ever need. 1024 or 4096 are often good choices. I like BUFSIZ.

char line[BUFSIZ];
while( fgets(line, sizeof(line), file) ) {
  // now parse line
}
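A fixed buffer still has a limit, so it can be worth detecting a line that did not fit: fgets stops when the buffer is full, and the missing '\n' gives it away. Here is a small sketch of that check; read_chunk is a hypothetical wrapper, not part of the standard library.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper wrapping fgets. Returns
     1  if a whole line was read (trailing '\n' removed),
     0  if the line did not fit in the buffer (or the last line lacks '\n'),
    -1  at end of file or on a read error. */
int read_chunk(char *line, int size, FILE *fp)
{
    assert(line != NULL && size > 1 && fp != NULL);

    if (fgets(line, size, fp) == NULL)
        return -1;

    size_t len = strcspn(line, "\n");   /* index of '\n', or of '\0' if none */
    int complete = (line[len] == '\n');
    line[len] = '\0';
    return complete;
}
```

When it returns 0 the rest of the line is still waiting in the file, so the caller can either report the line as too long or keep reading and append the pieces.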

The parsing can be done in various ways. A simple one is strtok (STRing TOKenize). This tokenizes line in place. Copy each token into the right amount of memory with strdup.

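char line[BUFSIZ];
while( fgets(line, sizeof(line), file) ) {
  // Array of pointers: each entry holds one strdup'd word.
  char *words[2] = { NULL, NULL };
  int i = 0;

  // Split on spaces, tabs and the trailing newline; keep at most two words.
  for(
    char *word = strtok(line, " \t\n");
    word && i < 2;
    word = strtok(NULL, " \t\n")
  ) {
    words[i] = strdup(word);
    i++;
  }

  // Print only if both words were found and both copies succeeded.
  if( i == 2 && words[0] && words[1] ) {
    printf("%s %s\n", words[0], words[1]);
  }
  free(words[0]);  // free(NULL) is a no-op, so short lines are safe
  free(words[1]);
}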

line and words are allocated on the stack, so they are freed automatically. But the memory allocated by strdup is on the heap and needs to be freed.
