Read file line by line and store lines in array of strings in C
I'm having trouble reading a file in C and storing its lines in an array of strings.
char **aLineToMatch;
FILE *file2;
int bufferLength = 255;
char buffer[bufferLength];
int i;
char *testFiles[] = { "L_2005149PL.01002201.xml.html",
                      "L_2007319PL.01000101.xml.html",
                      NULL};
char *testStrings[] = { "First",
                        "Second",
                        "Third.",
                        NULL};
file = fopen(testFiles[0], "r"); // loop will come later, thats not the problem
while(fgets(buffer, bufferLength, file2) != NULL) {
    printf("%s\n", buffer);
    // here should be adding to array of strings (testStrings declared above)
}
fclose(file);
}
Then I do some checks, some printing, and so on.
for(aLineToMatch=testStrings; *aLineToMatch != NULL; aLineToMatch++) {
    printf("String: %s\n", *aLineToMatch);
}
How do I correctly change the values of *testFiles[] so that it contains valid values read from the file, with a NULL added at the end?
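I think the key issue here is that in C you have to manage your own memory, and you need to understand the differences between the kinds of storage available in C.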
Simply put, there are three kinds: static storage, the stack, and the heap.
Here are some related links with more details:
https://www.geeksforgeeks.org/memory-layout-of-c-program/
https://craftofcoding.wordpress.com/2015/12/07/memory-in-c-the-stack-the-heap-and-static/
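Anyway, in higher-level languages pretty much everything lives on the heap, so you can manipulate it more or less however you like. Bog-standard arrays and string literals in C, however, have a fixed size: once declared, they cannot grow or shrink, and string literals in particular live in static storage.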
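To make the distinction concrete, here is a minimal sketch of the three kinds of storage side by side. The names `static_text`, `copy_to_heap` and `storage_demo` are made up for this illustration and do not appear in the question.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Static storage: exists for the whole run of the program, size fixed at compile time
static char static_text[] = "static";

// Returns a heap copy of `text`, or NULL if the allocation fails.
// The caller owns the result and must free() it.
char *copy_to_heap(const char *text) {
    assert(text != NULL);
    size_t size = strlen(text) + 1;   // +1 for the null terminator
    char *copy = malloc(size);
    if (copy != NULL) {
        memcpy(copy, text, size);
    }
    return copy;
}

// Touches all three kinds of storage and returns the combined length of the strings
size_t storage_demo(void) {
    char stack_text[] = "stack";              // automatic storage: gone when this function returns
    char *heap_text = copy_to_heap("heap");   // heap storage: lives until it is free()'d
    if (heap_text == NULL) {
        return 0;
    }
    size_t total = strlen(static_text) + strlen(stack_text) + strlen(heap_text);
    printf("%s, %s, %s\n", static_text, stack_text, heap_text);
    free(heap_text);                          // only heap memory may be passed to free()
    return total;
}
```

Only `heap_text` can be resized with realloc() or released with free(); the other two are managed by the compiler.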
The rest of this answer is in the code comments below.
I have modified your code and tried to explain why the changes are needed.
// @Compile gcc read_line_by_line.c && ./a.out
// @Compile gcc read_line_by_line.c && valgrind ./a.out

// getline() and strdup() are POSIX functions, not ISO C
// This macro makes sure they are declared even under strict flags such as -std=c11
#define _POSIX_C_SOURCE 200809L

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>
#include <stdbool.h>

// At file scope, the size of an array must be a compile-time constant
// i.e. it cannot be a variable like this: int n = 3; int numbers[n];
// (inside a function, C99 allows that as a variable-length array, but it still cannot be resized later)
#define BUFFER_SIZE_BYTES 255

// Uses static program storage, size is fixed at time of compilation
char *files[] = {"file1.txt", "file2.txt"}; // The size of this symbol is (sizeof(char*) * 2)
// Hence this line of code is valid even outside the body of a function
// because it doesn't actually execute,
// it just declares some memory that the compiler is supposed to provision in the resulting binary executable

// Divide the total size, by the size of an element, to calculate the number of elements
const int num_files = sizeof(files) / sizeof(files[0]);

int main(void) {
    printf("Program start\n\n");
    printf("There are %d files to read.\n", num_files);

    // These lines are in the body of a function and they execute at runtime
    // This means we are now allocating memory 'on-the-fly' at runtime
    int num_lines = 3;
    char **lines = malloc(sizeof(lines[0]) * num_lines);
    if (lines == NULL) {
        fprintf(stderr, "Failed to allocate memory at %s:%d\n", __FILE__, __LINE__);
        return EXIT_FAILURE;
    }
    // lines[0] = "First"; // This would assign a pointer to some static storage containing the bytes { 'F', 'i', 'r', 's', 't', '\0' }
    lines[0] = strdup("First");  // Use strdup() instead to allocate a copy of the string on the heap
    lines[1] = strdup("Second"); // This is so that we don't end up with a mixture of strings
    lines[2] = strdup("Third");  // with different kinds of storage in the same array
    // because only the heap strings can be free()'d
    // and trying to free() static strings is an error
    // but you won't be able to tell them apart,
    // they will all just look like pointers
    // and you won't know which ones are safe to free()
    printf("There are %d lines in the array.\n", num_lines);

    // Reading the file this way splits any line longer than 254 characters (newline included)
    // across several array entries, because fgets() stops as soon as the buffer is full
    /*
    printf("\nReading file...\n");
    FILE *fp = fopen(files[0], "r");
    if (fp == NULL) {
        perror(files[0]);
        return EXIT_FAILURE;
    }
    char buffer[BUFFER_SIZE_BYTES];
    while (fgets(buffer, BUFFER_SIZE_BYTES, fp) != NULL) {
        printf("%s\n", buffer); // buffer still contains the line's own '\n', if there was one
        // Resize the array we allocated on the heap
        void *ptr = realloc(lines, (num_lines + 1) * sizeof(lines[0]));
        // Note that this can fail if there isn't enough free memory available
        // This is also a comparatively expensive operation
        // so you wouldn't typically do a resize for every single line
        // Normally you would allocate extra space, wait for it to run out, then reallocate
        // Either growing by a fixed size, or even doubling the size, each time it gets full
        // Check if the allocation was successful
        if (ptr == NULL) {
            fprintf(stderr, "Failed to allocate memory at %s:%d\n", __FILE__, __LINE__);
            assert(false);
            exit(EXIT_FAILURE); // still stops the program when assert() is compiled out by NDEBUG
        }
        // Overwrite `lines` with the pointer to the new memory region only if realloc() was successful
        lines = ptr;
        // We cannot simply lines[num_lines] = buffer
        // because we will end up with an array full of pointers
        // that are all pointing to `buffer`
        // and in the next iteration of the loop
        // we will overwrite the contents of `buffer`
        // so all appended strings will be the same: the last line of the file
        // So we strdup() to allocate a copy on the heap
        // we must remember to free() this later
        lines[num_lines] = strdup(buffer);
        // Keep track of the size of the array
        num_lines++;
    }
    fclose(fp);
    printf("Done.\n");
    */

    // I would recommend reading the file this way instead
    ///*
    printf("\nReading file...\n");
    FILE *fp = fopen(files[0], "r");
    if (fp == NULL) { // Without this check, getline() would be handed a NULL stream if the file is missing
        perror(files[0]);
        return EXIT_FAILURE;
    }
    char *new_line = NULL; // This buffer is allocated for us by getline() and could be any length, we must free() it though afterwards
    size_t str_len = 0;    // getline() stores the size of the allocated buffer here (its capacity, not the length of the line)
    ssize_t bytes_read;    // This will store the bytes read from the file (excluding null-terminator), or -1 on error or end-of-file
    while ((bytes_read = getline(&new_line, &str_len, fp)) != -1) {
        printf("%s\n", new_line); // new_line still contains the line's own '\n', if there was one
        // Resize the array we allocated on the heap
        void *ptr = realloc(lines, (num_lines + 1) * sizeof(lines[0]));
        // Note that this can fail if there isn't enough free memory available
        // This is also a comparatively expensive operation
        // so you wouldn't typically do a resize for every single line
        // Normally you would allocate extra space, wait for it to run out, then reallocate
        // Either growing by a fixed size, or even doubling the size, each time it gets full
        // Check if the allocation was successful
        if (ptr == NULL) {
            fprintf(stderr, "Failed to allocate memory at %s:%d\n", __FILE__, __LINE__);
            assert(false);
            exit(EXIT_FAILURE); // still stops the program when assert() is compiled out by NDEBUG
        }
        // Overwrite `lines` with the pointer to the new memory region only if realloc() was successful
        lines = ptr;
        // Allocate a copy on the heap
        // so that the array elements don't all point to the same buffer
        // we must remember to free() this later
        lines[num_lines] = strdup(new_line);
        // Keep track of the size of the array
        num_lines++;
    }
    free(new_line); // Free the buffer that was allocated by getline()
    fclose(fp);     // Close the file since we're done with it
    printf("Done.\n");
    //*/

    printf("\nThere are %d lines in the array:\n", num_lines);
    for (int i = 0; i < num_lines; i++) {
        printf("%d: \"%s\"\n", i, lines[i]);
    }

    // Here you can do what you need to with the data...

    // free() each string
    // We know they're all allocated on the heap
    // because we made copies of the statically allocated strings
    for (int i = 0; i < num_lines; i++) {
        free(lines[i]);
    }
    // free() the array itself
    free(lines);

    printf("\nProgram end.\n");
    // At this point we should have free()'d everything that we allocated
    // If you run the program with Valgrind, you should get the magic words:
    // "All heap blocks were freed -- no leaks are possible"
    return 0;
}
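The comments above mention that you would normally allocate extra space and only reallocate when it runs out. Here is a rough sketch of that idea which also keeps the NULL terminator the question asks for. The names `read_lines`, `free_lines` and `copy_string`, the starting capacity of 4 and the 256-byte buffer are all assumptions made for this illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Frees a NULL-terminated array of heap strings, then the array itself
void free_lines(char **lines) {
    if (lines == NULL) {
        return;
    }
    for (char **p = lines; *p != NULL; p++) {
        free(*p);
    }
    free(lines);
}

// Portable stand-in for strdup(): returns a heap copy of `text`, or NULL on failure
static char *copy_string(const char *text) {
    size_t size = strlen(text) + 1;
    char *copy = malloc(size);
    if (copy != NULL) {
        memcpy(copy, text, size);
    }
    return copy;
}

// Reads `fp` into a NULL-terminated array of heap strings (trailing newlines stripped).
// Stores the number of lines in *count. Returns NULL if an allocation fails.
// Lines longer than the local buffer end up split across several entries.
char **read_lines(FILE *fp, size_t *count) {
    assert(fp != NULL && count != NULL);
    size_t capacity = 4;   // arbitrary starting capacity
    size_t used = 0;
    char **lines = malloc(capacity * sizeof lines[0]);
    if (lines == NULL) {
        return NULL;
    }
    char buffer[256];
    while (fgets(buffer, sizeof buffer, fp) != NULL) {
        buffer[strcspn(buffer, "\n")] = '\0';   // cut the line at its newline, if any
        if (used + 1 >= capacity) {             // always keep one slot for the final NULL
            size_t new_capacity = capacity * 2; // double instead of growing by one
            char **grown = realloc(lines, new_capacity * sizeof lines[0]);
            if (grown == NULL) {
                lines[used] = NULL;             // terminate so free_lines() can walk the array
                free_lines(lines);
                return NULL;
            }
            lines = grown;
            capacity = new_capacity;
        }
        char *copy = copy_string(buffer);
        if (copy == NULL) {
            lines[used] = NULL;
            free_lines(lines);
            return NULL;
        }
        lines[used++] = copy;
    }
    lines[used] = NULL;   // terminator, so callers can loop until *p == NULL
    *count = used;
    return lines;
}
```

With doubling, reading n lines costs only about log2(n) calls to realloc() instead of n, and the result can be walked with the same `*aLineToMatch != NULL` loop as in the question.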
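If you want to add elements to an array, you have 3 options: (1) fix the size of the array at compile time, (2) determine the size at run time and declare a variable-length array, or (3) allocate the array dynamically on the heap and grow it with realloc().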
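Option 1 will not work here, because there is no way to know at compile time how many lines your file has.
Option 2 means finding the number of lines first, which means iterating over the file twice. It also means that the array is freed automatically when you return from the function that reads the file.
Option 3 is the best. Here is an example: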
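For comparison, this is roughly what option 2 looks like, assuming it refers to a variable-length array sized by a first pass over the file. The names `count_lines`, `longest_line_length` and `MAX_LINE` are hypothetical, and the sketch only makes sense for small files because the whole array lives on the stack.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_LINE 255

// First pass: counts the chunks fgets() will return, then rewinds the file
size_t count_lines(FILE *fp) {
    assert(fp != NULL);
    char buffer[MAX_LINE];
    size_t count = 0;
    while (fgets(buffer, sizeof buffer, fp) != NULL) {
        count++;
    }
    rewind(fp);
    return count;
}

// Second pass: loads the lines into a variable-length array and
// returns the length of the longest one (0 for an empty file)
size_t longest_line_length(FILE *fp) {
    size_t count = count_lines(fp);
    if (count == 0) {
        return 0;   // a zero-length VLA is not allowed
    }
    char lines[count][MAX_LINE];   // VLA: sized at run time, stored on the stack
    size_t loaded = 0;
    while (loaded < count && fgets(lines[loaded], MAX_LINE, fp) != NULL) {
        lines[loaded][strcspn(lines[loaded], "\n")] = '\0';   // cut at the newline
        loaded++;
    }
    size_t longest = 0;
    for (size_t i = 0; i < loaded; i++) {
        size_t length = strlen(lines[i]);
        if (length > longest) {
            longest = length;
        }
    }
    // `lines` disappears when this function returns,
    // so it can be used here but never handed back to the caller
    return longest;
}
```

This shows both drawbacks at once: the file is read twice, and everything that needs the lines has to happen inside the function that declared the array.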
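char **aLineToMatch;
FILE *file2;
int bufferLength = 255;
char buffer[bufferLength];
int i = 0;
char *testFiles[] = { "L_2005149PL.01002201.xml.html",
                      "L_2007319PL.01000101.xml.html",
                      NULL};
char (*testStrings)[bufferLength] = NULL; // pointer to rows of bufferLength chars each, grown with realloc()
// you probably meant file2 here (or the normal file in the while condition)
file2 = fopen(testFiles[0], "r"); // loop will come later, that's not the problem
if (file2 != NULL) {
    while (fgets(buffer, bufferLength, file2) != NULL) {
        printf("%s\n", buffer);
        // realloc() into a temporary so the existing rows are not lost if it fails
        void *tmp = realloc(testStrings, (i + 1) * sizeof testStrings[0]);
        if (tmp == NULL) {
            break;
        }
        testStrings = tmp;
        strcpy(testStrings[i], buffer); // fits: buffer never holds more than bufferLength bytes
        i++;
    }
    fclose(file2); // close the same handle that was opened
}
// testStrings now holds i lines; call free(testStrings) once you are done with them
}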
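The snippet above can be wrapped into a function along the following lines. This is only a sketch: the typedef `Line`, the function name `read_rows` and the macro `LINE_SIZE` are invented here, with `LINE_SIZE` standing in for `bufferLength`.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define LINE_SIZE 255

typedef char Line[LINE_SIZE];   // one fixed-width row

// Reads `fp` into one contiguous block of fixed-width rows.
// Stores the number of rows in *count.
// Returns NULL if the file is empty or an allocation fails.
Line *read_rows(FILE *fp, size_t *count) {
    assert(fp != NULL && count != NULL);
    Line *rows = NULL;
    size_t used = 0;
    char buffer[LINE_SIZE];
    while (fgets(buffer, sizeof buffer, fp) != NULL) {
        Line *grown = realloc(rows, (used + 1) * sizeof rows[0]);
        if (grown == NULL) {
            free(rows);   // realloc() left the old block untouched, so release it
            *count = 0;
            return NULL;
        }
        rows = grown;
        strcpy(rows[used], buffer);   // fits: buffer holds at most LINE_SIZE bytes
        used++;
    }
    *count = used;
    return rows;
}
```

Because every row has the same width, the whole table is a single allocation and is released with one `free(rows)`. The price is that each line occupies `LINE_SIZE` bytes no matter how short it is, and there is no NULL entry to mark the end, so the caller has to carry the count around.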