
Read file line by line and store lines in array of strings in C

I have a problem reading a file in C and storing its lines in an array of strings.

char **aLineToMatch;
FILE *file2; 
int bufferLength = 255;
char buffer[bufferLength];
int i;

char *testFiles[] = { "L_2005149PL.01002201.xml.html",
                       "L_2007319PL.01000101.xml.html",
                       NULL};

char *testStrings[] = { "First",
                         "Second",
                         "Third.",
                          NULL};


file = fopen(testFiles[0], "r"); // loop will come later, thats not the problem

while(fgets(buffer, bufferLength, file2) != NULL) {
 printf("%s\n", buffer);
 // here should be adding to array of strings (testStrings declared above)
} 
 fclose(file);
}

And then I do some checks, some prints, etc.

for(aLineToMatch=testStrings; *aLineToMatch != NULL; aLineToMatch++) {
    printf("String: %s\n", *aLineToMatch);

How do I properly change the values of *testFiles[] to include valid values read from the file and add NULL at the end?

I think the key issue here is that in C you must manage your own memory, and you need to know the difference between the different types of storage available in C.

Simply put, there's:

  1. Stack
  2. Heap
  3. Static

Here are some relevant links with more detail about this:

https://www.geeksforgeeks.org/memory-layout-of-c-program/

https://craftofcoding.wordpress.com/2015/12/07/memory-in-c-the-stack-the-heap-and-static/

In many higher-level languages most values live on the heap anyway, so you can pretty much manipulate them however you please. However, bog-standard arrays in C have a fixed size, and string literals live in static storage, so neither can simply be grown at runtime.
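As a rough illustration of the three kinds of storage listed above, here is a minimal sketch. The names `static_greeting`, `make_heap_copy` and `storage_demo` are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Static storage: exists for the whole run of the program, size fixed at compile time
static char static_greeting[] = "static";

// Returns a heap copy of `text`, or NULL if malloc() fails
// The caller owns the copy and must free() it
char *make_heap_copy(const char *text) {
  assert(text != NULL);
  size_t size = strlen(text) + 1; // +1 for the null-terminator
  char *copy = malloc(size);      // Heap: lives until free() is called
  if (copy != NULL) {
    memcpy(copy, text, size);
  }
  return copy;
}

// Uses one string of each kind; returns 0 on success, -1 if allocation failed
int storage_demo(void) {
  char on_stack[] = "stack"; // Stack (automatic): gone when this function returns
  char *on_heap = make_heap_copy("heap");
  if (on_heap == NULL) {
    return -1;
  }
  printf("%s %s %s\n", static_greeting, on_stack, on_heap);
  free(on_heap); // Only the heap string may (and must) be free()'d
  return 0;
}
```

Calling `storage_demo()` from `main()` prints the three strings. Only the pointer returned by `make_heap_copy()` can safely outlive the function that created it.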

The rest of this answer is in the code comments below.

I've modified your code and tried to give explanations and context as to why it is needed.

// @Compile gcc read_line_by_line.c && ./a.out
// @Compile gcc read_line_by_line.c && valgrind ./a.out
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>
#include <stdbool.h>

// An array declared at file scope must have a size that is a compile-time constant
// i.e. it cannot depend on a variable like this: int n = 3; int numbers[n];
// (C99 allows that form only for local arrays, as a variable-length array)
#define BUFFER_SIZE_BYTES 255

// Uses static program storage, size is fixed at time of compilation
char *files[] = {"file1.txt", "file2.txt"}; // The size of this symbol is (sizeof(char*) * 2)
// Hence this line of code is valid even outside the body of a function
// because it doesn't actually execute,
// it just declares some memory that the compiler is supposed to provision in the resulting binary executable

// Divide the total size, by the size of an element, to calculate the number of elements
const int num_files = sizeof(files) / sizeof(files[0]);

int main() {
  printf("Program start\n\n");

  printf("There are %d files to read.\n", num_files);

  // These lines are in the body of a function and they execute at runtime
  // This means we are now allocating memory 'on-the-fly' at runtime
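  int num_lines = 3;
  char **lines = malloc(sizeof(lines[0]) * num_lines);
  if (lines == NULL) { // malloc() returns NULL if there isn't enough free memory
    fprintf(stderr, "Failed to allocate memory at %s:%d\n", __FILE__, __LINE__);
    return EXIT_FAILURE;
  }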

  // lines[0] = "First"; // This would assign a pointer to some static storage containing the bytes { 'F', 'i', 'r', 's', 't', '\0' }
  lines[0] = strdup("First");  // Use strdup() instead to allocate a copy of the string on the heap
  lines[1] = strdup("Second"); // This is so that we don't end up with a mixture of strings
  lines[2] = strdup("Third");  // with different kinds of storage in the same array
  // because only the heap strings can be free()'d
  // and trying to free() static strings is an error
  // but you won't be able to tell them apart,
  // they will all just look like pointers
  // and you won't know which ones are safe to free()

  printf("There are %d lines in the array.\n", num_lines);

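  // Reading the file this way only keeps a line whole if it fits in the buffer
  // (at most BUFFER_SIZE_BYTES - 1 characters, newline included); longer lines get split across several array elements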
  /*
  printf("\nReading file...\n");
  FILE *fp = fopen(files[0], "r");
  char buffer[BUFFER_SIZE_BYTES];
  while (fgets(buffer, BUFFER_SIZE_BYTES, fp) != NULL) {
    printf("%s\n", buffer);

    // Resize the array we allocated on the heap
    void *ptr = realloc(lines, (num_lines + 1) * sizeof(lines[0]));
    // Note that this can fail if there isn't enough free memory available
    // This is also a comparatively expensive operation
    // so you wouldn't typically do a resize for every single line
    // Normally you would allocate extra space, wait for it to run out, then reallocate
    // Either growing by a fixed size, or even doubling the size, each time it gets full

    // Check if the allocation was successful
    if (ptr == NULL) {
      fprintf(stderr, "Failed to allocate memory at %s:%d\n", __FILE__, __LINE__);
      assert(false);
    }
    // Overwrite `lines` with the pointer to the new memory region only if realloc() was successful
    lines = ptr;

    // We cannot simply lines[num_lines] = buffer
    // because we will end up with an array full of pointers
    // that are all pointing to `buffer`
    // and in the next iteration of the loop
    // we will overwrite the contents of `buffer`
    // so all appended strings will be the same: the last line of the file

    // So we strdup() to allocate a copy on the heap
    // we must remember to free() this later
    lines[num_lines] = strdup(buffer);

    // Keep track of the size of the array
    num_lines++;
  }
  fclose(fp);
  printf("Done.\n");
  */

  // I would recommend reading the file this way instead
  ///*
  printf("\nReading file...\n");
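  FILE *fp = fopen(files[0], "r");
  if (fp == NULL) { // fopen() returns NULL if the file is missing or unreadable
    perror(files[0]);
    return EXIT_FAILURE; // The OS reclaims our allocations when the process exits
  }
  char *new_line = NULL; // This buffer is allocated for us by getline() and can hold a line of any length, we must free() it though afterwards
  size_t str_len = 0;    // This will store the size of the buffer that getline() allocated (not the length of the line)
  ssize_t bytes_read; // This will store the number of characters read (including the newline, excluding null-terminator), or -1 on error or end-of-file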
  while ((bytes_read = getline(&new_line, &str_len, fp)) != -1) {
    printf("%s\n", new_line);

    // Resize the array we allocated on the heap
    void *ptr = realloc(lines, (num_lines + 1) * sizeof(lines[0]));
    // Note that this can fail if there isn't enough free memory available
    // This is also a comparatively expensive operation
    // so you wouldn't typically do a resize for every single line
    // Normally you would allocate extra space, wait for it to run out, then reallocate
    // Either growing by a fixed size, or even doubling the size, each time it gets full

    // Check if the allocation was successful
    if (ptr == NULL) {
      fprintf(stderr, "Failed to allocate memory at %s:%d\n", __FILE__, __LINE__);
      assert(false);
    }
    // Overwrite `lines` with the pointer to the new memory region only if realloc() was successful
    lines = ptr;

    // Allocate a copy on the heap
    // so that the array elements don't all point to the same buffer
    // we must remember to free() this later
    lines[num_lines] = strdup(new_line);

    // Keep track of the size of the array
    num_lines++;
  }
  free(new_line); // Free the buffer that was allocated by getline()
  fclose(fp);     // Close the file since we're done with it
  printf("Done.\n");
  //*/

  printf("\nThere are %d lines in the array:\n", num_lines);
  for (int i = 0; i < num_lines; i++) {
    printf("%d: \"%s\"\n", i, lines[i]);
  }

  // Here you can do what you need to with the data...

  // free() each string
  // We know they're all allocated on the heap
  // because we made copies of the statically allocated strings
  for (int i = 0; i < num_lines; i++) {
    free(lines[i]);
  }

  // free() the array itself
  free(lines);

  printf("\nProgram end.\n");
  // At this point we should have free()'d everything that we allocated
  // If you run the program with Valgrind, you should get the magic words:
  // "All heap blocks were freed -- no leaks are possible"
  return 0;
}
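The comments in the code above mention that you would normally grow the array in larger steps instead of calling realloc() for every line, and the question asks for a NULL at the end of the array. Here is a minimal sketch of both ideas together, assuming a POSIX system where getline() is available. The function names `read_lines` and `free_lines` are hypothetical, and stripping the trailing newline is a choice made for this example, not something the code above does.

```c
#define _POSIX_C_SOURCE 200809L // Exposes getline() and ssize_t on POSIX systems
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

// Frees a NULL-terminated array of heap strings, and the array itself
void free_lines(char **lines) {
  if (lines == NULL) {
    return;
  }
  for (char **p = lines; *p != NULL; p++) {
    free(*p);
  }
  free(lines);
}

// Reads every line of `fp` into a heap-allocated, NULL-terminated array
// Returns NULL if an allocation fails; trailing newlines are stripped
char **read_lines(FILE *fp) {
  assert(fp != NULL);
  size_t count = 0;
  size_t capacity = 4; // Arbitrary starting capacity, doubled whenever it runs out
  char **lines = malloc(capacity * sizeof lines[0]);
  if (lines == NULL) {
    return NULL;
  }
  lines[0] = NULL; // Keep the array NULL-terminated at all times

  char *line = NULL;
  size_t line_cap = 0;
  ssize_t len;
  while ((len = getline(&line, &line_cap, fp)) != -1) {
    if (len > 0 && line[len - 1] == '\n') {
      line[len - 1] = '\0';
    }
    // We need room for the new string plus the NULL sentinel
    if (count + 2 > capacity) {
      char **grown = realloc(lines, 2 * capacity * sizeof lines[0]);
      if (grown == NULL) {
        free(line);
        free_lines(lines);
        return NULL;
      }
      lines = grown;
      capacity *= 2;
    }
    lines[count++] = line; // Take ownership of the buffer getline() allocated
    lines[count] = NULL;
    line = NULL;           // Force getline() to allocate a fresh buffer next time
    line_cap = 0;
  }
  free(line); // getline() may leave a buffer behind even when it returns -1
  return lines;
}
```

The result can be walked with the same kind of loop as in the question, `for (char **p = lines; *p != NULL; p++)`, and released with a single call to `free_lines(lines)`.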

If you want to add elements to an array, you have 3 options:

  1. Determine the maximum number of elements at compile-time and create a correctly sized array
  2. Determine the maximum number of elements at run-time and create a variable-length array (works only in C99 and later)
  3. Dynamically allocate the array and expand it as needed

Option 1 doesn't work here because it is impossible to know at compile-time how many lines your file will have.

Option 2 would imply that you first find the number of lines, which means iterating over the file twice. It also means that when you return from the function that reads the file, the array is automatically deallocated.
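To make the trade-off of option 2 concrete, here is a rough sketch of the two-pass approach. It assumes every line fits in `MAX_LINE_BYTES` (a hypothetical constant mirroring the 255-byte buffer in the question) and that the file is small enough for the array to fit on the stack. The function names `count_lines` and `print_lines_with_vla` are made up for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_LINE_BYTES 255 // Assumed upper bound on the size of one line

// First pass: count the lines, then rewind so the file can be read again
// A final line without a trailing '\n' still counts as a line
size_t count_lines(FILE *fp) {
  assert(fp != NULL);
  size_t n = 0;
  int c;
  int last = '\n';
  while ((c = fgetc(fp)) != EOF) {
    if (c == '\n') {
      n++;
    }
    last = c;
  }
  if (last != '\n') {
    n++;
  }
  rewind(fp);
  return n;
}

// Second pass: store the lines in a variable-length array and use them
// The array disappears when this function returns, so all the work
// that needs the lines has to happen in here
// Returns the number of lines stored
size_t print_lines_with_vla(FILE *fp) {
  size_t n = count_lines(fp);
  if (n == 0) {
    return 0; // A zero-length VLA is not allowed
  }
  char lines[n][MAX_LINE_BYTES]; // Size chosen at run-time, storage on the stack
  size_t stored = 0;
  while (stored < n && fgets(lines[stored], MAX_LINE_BYTES, fp) != NULL) {
    lines[stored][strcspn(lines[stored], "\n")] = '\0'; // Strip the newline
    stored++;
  }
  for (size_t i = 0; i < stored; i++) {
    printf("%zu: %s\n", i, lines[i]);
  }
  return stored;
}
```

Because the array cannot be returned to the caller and a large file could overflow the stack, this sketch mostly serves to show why the dynamically allocated array is usually preferred.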

Option 3 is the best. Here is an example:

char **aLineToMatch;
FILE *file2; 
int bufferLength = 255;
char buffer[bufferLength];
int i = 0;

char *testFiles[] = { "L_2005149PL.01002201.xml.html",
                       "L_2007319PL.01000101.xml.html",
                       NULL};

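char (*testStrings)[bufferLength] = NULL; // pointer to the first element of an array of fixed-size char buffers

// you probably meant file2 here (or plain file in the while condition)
file2 = fopen(testFiles[0], "r"); // loop will come later, thats not the problem
if (file2 == NULL) {
 perror(testFiles[0]);
 exit(EXIT_FAILURE); // needs <stdlib.h>
}

while(fgets(buffer, bufferLength, file2) != NULL) {
 printf("%s\n", buffer);
 // grow the array by one row; keep the old pointer until realloc() is known to have succeeded
 void *tmp = realloc(testStrings, (i + 1) * sizeof testStrings[0]);
 if (tmp == NULL) {
  free(testStrings);
  exit(EXIT_FAILURE);
 }
 testStrings = tmp;
 strcpy(testStrings[i], buffer);
 i++;
} 
 fclose(file2);
}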


