Dynamically allocating memory for an array and reading a large text file

Question

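I've looked at some similar questions and examples, but I'm stumped. My goal is to open a very large text file (novel-sized), allocate memory for an array, and then store the text in that array so I can process it further later.

Here is my current code: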

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define LINELEN 74

int main(void) {

FILE *file;
char filename[] = "large.txt";
int count = 0, i = 0, len;

/* Open the file */
  file = fopen(filename, "r");
  if (file == NULL) {
      printf("Cannot open file");
      return -1;
  }
    
/* Get size of file for memory allocation */
    fseek(file, 0, SEEK_END);
    long size = ftell(file);
    fseek(file, 0, SEEK_SET);
    
/* Allocate memory to the array */
  char *text_array = (char*)malloc(size*sizeof(char));
    
/* Store the information into the array */
    while(fgets(&text_array[count], LINELEN, file) != NULL) {
      count++;
      }

  len = sizeof(text_array) / sizeof(text_array[0]);

  while(i<len) {
    /* printf("%s", text_array); */
    i++;
  }
  printf("%s", text_array);

/* return array */
    return EXIT_SUCCESS;
}

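I expected the printf of text_array at the bottom to print a large body of text. Instead I get a garbled run of random characters that is much shorter than the text I was hoping for. What am I doing wrong? I suspect it has something to do with my memory allocation, but I can't tell what.

Any help is much appreciated.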

Answer 1

There's no need to call fgets() in a loop. You know how big the file is, so you can read the whole thing into text_array with a single call:

fread(text_array, 1, size, file);

However, if you want to treat text_array as a string, you need to add a null terminator, so you should add 1 to the size when calling malloc().

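Another problem is len = sizeof(text_array) / sizeof(text_array[0]). text_array is a pointer, not an array, so sizeof gives you the size of the pointer itself rather than the amount of memory it points to. You don't need this anyway, because you already have the length in the size variable.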
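
As a small sketch of that difference (the two function names here are made up for illustration and are not part of the original code), sizeof behaves differently on a pointer and on a real array:

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: sizeof applied to a pointer yields the size of the
 * pointer itself (commonly 4 or 8 bytes), not the size of the block that
 * malloc() returned. */
size_t size_seen_through_pointer(void)
{
    char *text = malloc(1000);
    size_t result = sizeof(text);   /* same value as sizeof(char *) */
    free(text);
    return result;
}

/* Hypothetical helper: sizeof applied to a true array covers the whole
 * array, so dividing by the element size gives the element count. */
size_t count_of_real_array(void)
{
    char text[1000];
    return sizeof(text) / sizeof(text[0]);   /* 1000 elements */
}
```

This is why the length of a malloc'd buffer has to be tracked separately, as the size variable already does.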

There's also no need for a loop to print text_array.

#include <stdio.h>
#include <stdlib.h>

int main(void) {

    FILE *file;
    char filename[] = "large.txt";

/* Open the file */
    file = fopen(filename, "r");
    if (file == NULL) {
        printf("Cannot open file");
        return -1;
    }
    
/* Get size of file for memory allocation */
    fseek(file, 0, SEEK_END);
    long size = ftell(file);
    fseek(file, 0, SEEK_SET);
    if (size < 0) {
        printf("Cannot determine file size");
        fclose(file);
        return -1;
    }
    
/* Allocate memory to the array, plus one byte for the null terminator */
    char *text_array = malloc((size_t)size + 1);
    if (text_array == NULL) {
        printf("Cannot allocate memory");
        fclose(file);
        return -1;
    }
    
/* Store the information into the array */
/* fread() may return fewer bytes than size (e.g. text mode on Windows) */
    size_t nread = fread(text_array, 1, (size_t)size, file);
    text_array[nread] = '\0';
    printf("%s", text_array);

/* Clean up */
    free(text_array);
    fclose(file);
    return EXIT_SUCCESS;
}

Answer 2

This part

while(fgets(&text_array[count], LINELEN, file) != NULL) {
  count++;
}

is problematic.

If the loop is unrolled, it is "sort of like":

fgets(&text_array[0], LINELEN, file)
fgets(&text_array[1], LINELEN, file)
fgets(&text_array[2], LINELEN, file)

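So you only advance the fgets destination buffer by one character between calls. Assuming fgets reads more than one character, the second fgets overwrites data from the first, the third overwrites data from the second, and so on.

You need to advance the buffer by as many characters as fgets actually read, or use another way of reading, such as fread.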
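
A minimal sketch of the first option, assuming a plain text file with no embedded null bytes. The helper name read_lines and its parameters are hypothetical and not part of the original code:

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read fp into buf line by line. cap is the total
 * capacity of buf in bytes, including room for the null terminator.
 * After each fgets() the write position moves forward by the number of
 * characters actually stored, so later reads never overwrite earlier ones.
 * Returns the number of characters stored, excluding the terminator. */
size_t read_lines(FILE *fp, char *buf, size_t cap)
{
    size_t used = 0;

    assert(fp != NULL && buf != NULL);
    if (cap == 0)
        return 0;
    buf[0] = '\0';

    /* fgets() needs room for at least one character plus the terminator */
    while (cap - used > 1) {
        size_t room = cap - used;
        int chunk = room > INT_MAX ? INT_MAX : (int)room;

        if (fgets(buf + used, chunk, fp) == NULL)
            break;                      /* end of file or read error */
        used += strlen(buf + used);     /* step past what was just read */
    }
    return used;
}
```

With the question's variables this would be called as read_lines(file, text_array, size + 1), provided text_array was allocated with size + 1 bytes.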
