Read file line by line and store lines in array of strings in C

I'm having trouble reading a file in C and storing its lines in an array of strings.

char **aLineToMatch;
FILE *file2; 
int bufferLength = 255;
char buffer[bufferLength];
int i;

char *testFiles[] = { "L_2005149PL.01002201.xml.html",
                       "L_2007319PL.01000101.xml.html",
                       NULL};

char *testStrings[] = { "First",
                         "Second",
                         "Third.",
                          NULL};


file = fopen(testFiles[0], "r"); // loop will come later, thats not the problem

while(fgets(buffer, bufferLength, file2) != NULL) {
 printf("%s\n", buffer);
 // here should be adding to array of strings (testStrings declared above)
} 
 fclose(file);
}

Then I do some checks, some printing, etc.

for(aLineToMatch=testStrings; *aLineToMatch != NULL; aLineToMatch++) {
    printf("String: %s\n", *aLineToMatch);

How do I properly change the values of *testFiles[] so that it contains the valid values read from the file, with NULL added at the end?

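I think the key issue here is that in C you have to manage your own memory, and you need to understand the difference between the kinds of storage available in C.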

Simply put, there are:

  1. Stack
  2. Heap
  3. Static

Here are some relevant links with more detail:

https://www.geeksforgeeks.org/memory-layout-of-c-program/

https://craftofcoding.wordpress.com/2015/12/07/memory-in-c-the-stack-the-heap-and-static/

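In higher-level languages, nearly everything lives on the heap, so you can manipulate it pretty much however you please. Bog-standard arrays and string literals in C, however, have fixed-size storage (static storage, in the case of string literals and arrays declared at file scope), so they cannot be grown at runtime.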
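To make the three kinds of storage concrete, here is a minimal sketch. The names `greeting`, `count_calls`, `make_copy` and `stack_demo` are made up for this illustration and do not come from the question.

```c
#include <stdlib.h>
#include <string.h>

// Static storage: exists for the whole run of the program,
// and its size is fixed when the program is compiled.
static const char greeting[] = "hello";

// Static storage inside a function: `calls` keeps its value between calls.
int count_calls(void) {
    static int calls = 0;
    return ++calls;
}

// Heap storage: allocated at runtime and alive until free() is called.
// Returns NULL if the allocation fails; the caller owns the result.
char *make_copy(const char *text) {
    size_t size = strlen(text) + 1;   // +1 for the terminating '\0'
    char *copy = malloc(size);
    if (copy != NULL) {
        memcpy(copy, text, size);
    }
    return copy;
}

// Stack (automatic) storage: `local` disappears when the function returns,
// so a pointer to it must never be returned. Returning its length is fine.
size_t stack_demo(void) {
    char local[16];                   // automatic storage, fixed size
    strcpy(local, greeting);          // "hello" plus '\0' fits in 16 bytes
    return strlen(local);
}
```

Only the pointer returned by `make_copy()` may be passed to `free()`. Passing `greeting` or the address of a local array to `free()` is undefined behaviour.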

The rest of this answer is in the code comments below.

I have modified your code and tried to explain why each change is needed.

// @Compile gcc read_line_by_line.c && ./a.out
// @Compile gcc read_line_by_line.c && valgrind ./a.out
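// getline() and strdup() are POSIX functions rather than ISO C,
// so request their declarations explicitly in case a strict -std=c99/-std=c11 is used
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>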
#include <stdlib.h>
#include <string.h>
#include <assert.h>
#include <stdbool.h>

// At file scope, the size of an array must be a compile-time constant
// i.e. it cannot be a runtime variable like this: int n = 3; int numbers[n];
// (inside a function, C99 allows that as a variable-length array, but it still cannot be resized)
#define BUFFER_SIZE_BYTES 255

// Uses static program storage, size is fixed at time of compilation
char *files[] = {"file1.txt", "file2.txt"}; // The size of this symbol is (sizeof(char*) * 2)
// Hence this line of code is valid even outside the body of a function
// because it doesn't actually execute,
// it just declares some memory that the compiler is supposed to provision in the resulting binary executable

// Divide the total size, by the size of an element, to calculate the number of elements
const int num_files = sizeof(files) / sizeof(files[0]);

int main() {
  printf("Program start\n\n");

  printf("There are %d files to read.\n", num_files);

  // These lines are in the body of a function and they execute at runtime
  // This means we are now allocating memory 'on-the-fly' at runtime
  int num_lines = 3;
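  char **lines = malloc(sizeof(lines[0]) * num_lines);
  if (lines == NULL) { // malloc() can fail, just like realloc() further down
    fprintf(stderr, "Failed to allocate memory at %s:%d\n", __FILE__, __LINE__);
    return EXIT_FAILURE;
  }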

  // lines[0] = "First"; // This would assign a pointer to some static storage containing the bytes { 'F', 'i', 'r', 's', 't', '\0' }
  lines[0] = strdup("First");  // Use strdup() instead to allocate a copy of the string on the heap
  lines[1] = strdup("Second"); // This is so that we don't end up with a mixture of strings
  lines[2] = strdup("Third");  // with different kinds of storage in the same array
  // because only the heap strings can be free()'d
  // and trying to free() static strings is an error
  // but you won't be able to tell them apart,
  // they will all just look like pointers
  // and you won't know which ones are safe to free()

  printf("There are %d lines in the array.\n", num_lines);

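  // Reading the file this way only works properly for lines shorter than 255 characters
  // (fgets() splits a longer line across several reads, and so across several array elements)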
  /*
  printf("\nReading file...\n");
  FILE *fp = fopen(files[0], "r");
  char buffer[BUFFER_SIZE_BYTES];
  while (fgets(buffer, BUFFER_SIZE_BYTES, fp) != NULL) {
    printf("%s\n", buffer);

    // Resize the array we allocated on the heap
    void *ptr = realloc(lines, (num_lines + 1) * sizeof(lines[0]));
    // Note that this can fail if there isn't enough free memory available
    // This is also a comparatively expensive operation
    // so you wouldn't typically do a resize for every single line
    // Normally you would allocate extra space, wait for it to run out, then reallocate
    // Either growing by a fixed size, or even doubling the size, each time it gets full

    // Check if the allocation was successful
    if (ptr == NULL) {
      fprintf(stderr, "Failed to allocate memory at %s:%d\n", __FILE__, __LINE__);
      assert(false);
    }
    // Overwrite `lines` with the pointer to the new memory region only if realloc() was successful
    lines = ptr;

    // We cannot simply lines[num_lines] = buffer
    // because we will end up with an array full of pointers
    // that are all pointing to `buffer`
    // and in the next iteration of the loop
    // we will overwrite the contents of `buffer`
    // so all appended strings will be the same: the last line of the file

    // So we strdup() to allocate a copy on the heap
    // we must remember to free() this later
    lines[num_lines] = strdup(buffer);

    // Keep track of the size of the array
    num_lines++;
  }
  fclose(fp);
  printf("Done.\n");
  */

  // I would recommend reading the file this way instead
  ///*
  printf("\nReading file...\n");
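  FILE *fp = fopen(files[0], "r");
  if (fp == NULL) { // fopen() returns NULL if the file is missing or cannot be read
    perror(files[0]);
    return EXIT_FAILURE; // For brevity we exit without freeing `lines`; the OS reclaims it
  }
  char *new_line = NULL; // This buffer is allocated (and grown as needed) for us by getline(), so lines can be any length; we must free() it afterwards
  size_t str_len = 0;    // This will store the size of the buffer allocated by getline() (not the length of the line)
  ssize_t bytes_read;    // This will store the number of characters read (including the newline, excluding the null-terminator), or -1 on error or end-of-file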
  while ((bytes_read = getline(&new_line, &str_len, fp)) != -1) {
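    // getline() keeps the trailing newline, so remove it before printing and storing
    if (bytes_read > 0 && new_line[bytes_read - 1] == '\n') {
      new_line[bytes_read - 1] = '\0';
    }
    printf("%s\n", new_line);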

    // Resize the array we allocated on the heap
    void *ptr = realloc(lines, (num_lines + 1) * sizeof(lines[0]));
    // Note that this can fail if there isn't enough free memory available
    // This is also a comparatively expensive operation
    // so you wouldn't typically do a resize for every single line
    // Normally you would allocate extra space, wait for it to run out, then reallocate
    // Either growing by a fixed size, or even doubling the size, each time it gets full

    // Check if the allocation was successful
    if (ptr == NULL) {
      fprintf(stderr, "Failed to allocate memory at %s:%d\n", __FILE__, __LINE__);
      assert(false);
    }
    // Overwrite `lines` with the pointer to the new memory region only if realloc() was successful
    lines = ptr;

    // Allocate a copy on the heap
    // so that the array elements don't all point to the same buffer
    // we must remember to free() this later
    lines[num_lines] = strdup(new_line);

    // Keep track of the size of the array
    num_lines++;
  }
  free(new_line); // Free the buffer that was allocated by getline()
  fclose(fp);     // Close the file since we're done with it
  printf("Done.\n");
  //*/

  printf("\nThere are %d lines in the array:\n", num_lines);
  for (int i = 0; i < num_lines; i++) {
    printf("%d: \"%s\"\n", i, lines[i]);
  }

  // Here you can do what you need to with the data...

  // free() each string
  // We know they're all allocated on the heap
  // because we made copies of the statically allocated strings
  for (int i = 0; i < num_lines; i++) {
    free(lines[i]);
  }

  // free() the array itself
  free(lines);

  printf("\nProgram end.\n");
  // At this point we should have free()'d everything that we allocated
  // If you run the program with Valgrind, you should get the magic words:
  // "All heap blocks were freed -- no leaks are possible"
  return 0;
}
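The comments above mention growing the array in larger steps instead of once per line, and the question asks for a NULL at the end of the array. Here is a minimal sketch of both ideas together. The function names `read_lines` and `free_lines` and the starting capacity of 4 are assumptions made for this illustration. It uses only standard C (`fgets()` rather than `getline()`), so a line longer than 255 characters would be split across several entries.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Frees a NULL-terminated array of heap-allocated strings.
void free_lines(char **lines) {
    if (lines == NULL) {
        return;
    }
    for (char **p = lines; *p != NULL; p++) {
        free(*p);
    }
    free(lines);
}

// Reads all lines from `fp` into a NULL-terminated array of strings.
// Returns NULL if an allocation fails. The caller must call free_lines().
char **read_lines(FILE *fp) {
    size_t capacity = 4;  // slots currently allocated (hypothetical starting size)
    size_t count = 0;     // lines stored so far
    char **lines = malloc(capacity * sizeof lines[0]);
    if (lines == NULL) {
        return NULL;
    }
    lines[0] = NULL;      // the array is always kept NULL-terminated

    char buffer[256];
    while (fgets(buffer, sizeof buffer, fp) != NULL) {
        buffer[strcspn(buffer, "\n")] = '\0';  // drop the trailing newline, if any

        // We need room for the new line plus the terminating NULL.
        if (count + 2 > capacity) {
            capacity *= 2;  // doubling keeps the number of realloc() calls small
            char **tmp = realloc(lines, capacity * sizeof lines[0]);
            if (tmp == NULL) {
                free_lines(lines);  // the old block is still valid here
                return NULL;
            }
            lines = tmp;
        }

        size_t size = strlen(buffer) + 1;
        char *copy = malloc(size);  // each entry gets its own heap copy
        if (copy == NULL) {
            free_lines(lines);
            return NULL;
        }
        memcpy(copy, buffer, size);
        lines[count++] = copy;
        lines[count] = NULL;
    }
    return lines;
}
```

The result can be walked with the same loop style as in the question, for example `for (char **p = lines; *p != NULL; p++) printf("String: %s\n", *p);`, followed by `free_lines(lines)`.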

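If you want to add elements to an array, you have 3 options:

  1. Determine the maximum number of elements at compile time and create an array of the right size
  2. Determine the maximum number of elements at runtime and create a variable-length array (only works in C99 and later)
  3. Allocate the array dynamically and expand it as needed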

Option 1 doesn't work here, because it is impossible to know at compile time how many lines your file will have.

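Option 2 implies that you first find the number of lines, which means iterating over the file twice. It also means that the array is automatically deallocated when you return from the function that reads the file.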
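For illustration only, option 2 could look roughly like the sketch below. The names `count_lines`, `total_length` and `LINE_LENGTH` are hypothetical. It assumes a compiler that supports variable-length arrays (they are optional since C11) and lines shorter than `LINE_LENGTH` characters.

```c
#include <stdio.h>
#include <string.h>

#define LINE_LENGTH 255

// First pass: count the lines, then rewind so the file can be read again.
size_t count_lines(FILE *fp) {
    size_t count = 0;
    int c;
    int last = '\n';
    while ((c = fgetc(fp)) != EOF) {
        if (c == '\n') {
            count++;
        }
        last = c;
    }
    if (last != '\n') {
        count++;  // the final line has no trailing newline
    }
    rewind(fp);
    return count;
}

// Second pass: store the lines in a variable-length array and use them.
// As a stand-in for real work, this returns the total number of characters stored.
size_t total_length(FILE *fp) {
    size_t n = count_lines(fp);
    if (n == 0) {
        return 0;                 // a VLA must have at least one element
    }
    char lines[n][LINE_LENGTH];   // VLA: automatic storage, sized at runtime
    size_t i = 0;
    size_t total = 0;
    while (i < n && fgets(lines[i], LINE_LENGTH, fp) != NULL) {
        lines[i][strcspn(lines[i], "\n")] = '\0';  // drop the trailing newline
        total += strlen(lines[i]);
        i++;
    }
    return total;                 // `lines` ceases to exist here
}
```

Because `lines` lives on the stack, it cannot be returned to the caller, and a very large file could exhaust the stack.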

Option 3 is the best. Here is an example:

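#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define BUFFER_LENGTH 255 // a constant, so the row size below is fixed at compile time

int main(void) {
    char buffer[BUFFER_LENGTH];
    size_t i = 0; // number of lines stored so far

    char *testFiles[] = { "L_2005149PL.01002201.xml.html",
                          "L_2007319PL.01000101.xml.html",
                          NULL };

    char (*testStrings)[BUFFER_LENGTH] = NULL; // pointer to an array of fixed-size strings

    // you probably meant file2 here (or the normal file in the while condition)
    FILE *file2 = fopen(testFiles[0], "r"); // loop will come later, that's not the problem
    if (file2 == NULL) {
        perror(testFiles[0]);
        return EXIT_FAILURE;
    }

    while (fgets(buffer, BUFFER_LENGTH, file2) != NULL) {
        printf("%s\n", buffer);
        // grow the array by one row; keep the old pointer until realloc() has succeeded
        void *tmp = realloc(testStrings, (i + 1) * sizeof testStrings[0]);
        if (tmp == NULL) {
            break; // out of memory: the first i lines are still valid
        }
        testStrings = tmp;
        strcpy(testStrings[i], buffer);
        i++;
    }
    fclose(file2);

    // testStrings[0] .. testStrings[i - 1] now hold the lines

    free(testStrings);
    return 0;
}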
