
Reading text file into an array of lines in C

Using C, I would like to read in the contents of a text file so that I end up with an array of strings, with the nth string representing the nth line of the text file. The lines of the file can be arbitrarily long.

What's an elegant way of accomplishing this? I know of some neat tricks to read a text file directly into a single appropriately sized buffer, but breaking it down into lines makes it trickier (at least as far as I can tell).

Thanks very much!

Breaking it down into lines means parsing the text and replacing all the EOL characters (by EOL I mean \n and \r) with 0. This way you can reuse your buffer and store just the beginning of each line in a separate char * array (all in only 2 passes).

This way you do a single read of the whole file plus 2 passes over the buffer, which would probably improve performance.
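
A minimal sketch of the two-pass idea described above, assuming the file contents are already in a writable buffer that has one spare byte after the data. The name split_lines is made up for illustration and is not from the answer.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Splits buf (len bytes of text, plus one extra writable byte) into
 * lines in place. Returns a malloc'd array of pointers into buf and
 * stores the number of lines in *count. Returns NULL if malloc fails.
 */
char **split_lines(char *buf, size_t len, size_t *count)
{
    size_t i, n = 0;
    char **lines;

    assert(buf != NULL && count != NULL);
    buf[len] = '\0';                    /* terminate the last line */

    /* Pass 1: count lines; a final line without '\n' still counts. */
    for (i = 0; i < len; i++)
        if (buf[i] == '\n')
            n++;
    if (len > 0 && buf[len - 1] != '\n')
        n++;

    lines = malloc((n + 1) * sizeof *lines);    /* +1 avoids malloc(0) */
    if (lines == NULL)
        return NULL;

    /* Pass 2: record each line start and overwrite EOL bytes with 0. */
    n = 0;
    i = 0;
    while (i < len) {
        size_t start = i;

        lines[n++] = &buf[start];
        while (i < len && buf[i] != '\n')
            i++;
        if (i > start && buf[i - 1] == '\r')
            buf[i - 1] = '\0';          /* strip the CR of a CRLF pair */
        if (i < len)
            buf[i++] = '\0';            /* replace LF, go to next line */
    }
    *count = n;
    return lines;
}
```

The pointers are only valid while buf is alive, so the caller frees the pointer array first and the buffer last.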

It's possible to count the number of lines in the file (loop over fgets), then create a 2-dimensional array whose first dimension is the number of lines + 1. Then just re-read the file into the array.

You'll need to define the length of the elements, though. Or, do a count for the longest line size.

Example code:

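#include <stdio.h>
#include <string.h>

#define FILENAME "input.txt"  /* assumed name; not given in the original */
#define MAX_LINE 1024         /* assumed limit; longer lines get split up */

int main(void)
{
    char buf[MAX_LINE];
    size_t lineCount = 0, i;
    FILE *inFile = fopen(FILENAME, "r");

    if (inFile == NULL) {
        perror(FILENAME);
        return 1;
    }
    /* First pass: count the lines (fgets reads lines, %s only reads words). */
    while (fgets(buf, sizeof buf, inFile) != NULL)
        lineCount++;

    /* One extra row so that names[n] is line n; row 0 stays unused.
       This is a C99 variable-length array on the stack, so it only
       suits small files. */
    char names[lineCount + 1][MAX_LINE];

    /* Second pass: re-read the file into the array. */
    rewind(inFile);
    for (i = 1; i <= lineCount && fgets(names[i], MAX_LINE, inFile) != NULL; i++)
        names[i][strcspn(names[i], "\n")] = '\0';   /* drop the newline */
    fclose(inFile);
    return 0;
}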
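
The "count for the longest line size" suggestion could look roughly like the sketch below. It makes one pass with fgetc and reports both the number of lines and the longest line. The name measure_lines is an assumption for illustration only.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Scans the stream once. Stores the number of lines in *lines and the
 * length of the longest line (not counting its '\n') in *longest.
 * Returns 0 on success, -1 on a read error.
 */
int measure_lines(FILE *f, size_t *lines, size_t *longest)
{
    size_t cur = 0;     /* length of the line being scanned */
    int c;

    assert(f != NULL && lines != NULL && longest != NULL);
    *lines = 0;
    *longest = 0;
    while ((c = fgetc(f)) != EOF) {
        if (c == '\n') {
            (*lines)++;
            if (cur > *longest)
                *longest = cur;
            cur = 0;
        } else {
            cur++;
        }
    }
    if (cur > 0) {      /* last line had no trailing newline */
        (*lines)++;
        if (cur > *longest)
            *longest = cur;
    }
    return ferror(f) ? -1 : 0;
}
```

After rewinding the file, each row needs longest + 2 bytes if it is filled with fgets: one for the newline and one for the terminating '\0'.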

You can do it this way:

#include <stdlib.h> /* exit, malloc, realloc, free */
#include <stdio.h>  /* fopen, fgetc, fputs, fwrite */

struct line_reader {
    /* All members are private. */
    FILE    *f;
    char    *buf;
    size_t   siz;
};

/*
 * Initializes a line reader _lr_ for the stream _f_.
 */
void
lr_init(struct line_reader *lr, FILE *f)
{
    lr->f = f;
    lr->buf = NULL;
    lr->siz = 0;
}

/*
 * Reads the next line. If successful, returns a pointer to the line,
 * and sets *len to the number of characters, at least 1. The result is
 * _not_ a C string; it has no terminating '\0'. The returned pointer
 * remains valid until the next call to next_line() or lr_free() with
 * the same _lr_.
 *
 * next_line() returns NULL at end of file, or if there is an error (on
 * the stream, or with memory allocation).
 */
char *
next_line(struct line_reader *lr, size_t *len)
{
    size_t newsiz;
    int c;
    char *newbuf;

    *len = 0;           /* Start with empty line. */
    for (;;) {
        c = fgetc(lr->f);   /* Read next character. */
        if (ferror(lr->f))
            return NULL;

        if (c == EOF) {
            /*
             * End of file is also end of last line,
             * unless this last line would be empty.
             */
            if (*len == 0)
                return NULL;
            else
                return lr->buf;
        } else {
            /* Append c to the buffer. */
            if (*len == lr->siz) {
                /* Need a bigger buffer! */
                newsiz = lr->siz + 4096;
                newbuf = realloc(lr->buf, newsiz);
                if (newbuf == NULL)
                    return NULL;
                lr->buf = newbuf;
                lr->siz = newsiz;
            }
            lr->buf[(*len)++] = c;

            /* '\n' is end of line. */
            if (c == '\n')
                return lr->buf;
        }
    }
}

/*
 * Frees internal memory used by _lr_.
 */
void
lr_free(struct line_reader *lr)
{
    free(lr->buf);
    lr->buf = NULL;
    lr->siz = 0;
}

/*
 * Read a file line by line.
 * http://rosettacode.org/wiki/Read_a_file_line_by_line
 */
int
main(void)
{
    struct line_reader lr;
    FILE *f;
    size_t len;
    char *line;

    f = fopen("foobar.txt", "r");
    if (f == NULL) {
        perror("foobar.txt");
        exit(1);
    }

    /*
     * This loop reads each line.
     * Remember that line is not a C string.
     * There is no terminating '\0'.
     */
    lr_init(&lr, f);
    while ((line = next_line(&lr, &len)) != NULL) {
        /*
         * Do something with line.
         */
        fputs("LINE: ", stdout);
        fwrite(line, len, 1, stdout);
    }
    if (!feof(f)) {
        perror("next_line");
        exit(1);
    }
    lr_free(&lr);

    return 0;
}

For C (as opposed to C++), you'd probably wind up using fgets(). However, you might run into issues due to your arbitrarily long lines.
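
One way around the line-length issue with fgets() is to keep calling it into a buffer that grows until a newline shows up. This is only a sketch: read_long_line is a made-up name, and it assumes the file contains no embedded '\0' bytes, since it relies on strlen.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Reads one line of any length with fgets(), growing the buffer as
 * needed. Returns a malloc'd string without the trailing '\n', or NULL
 * at end of file or on error. The caller frees the result.
 */
char *read_long_line(FILE *f)
{
    size_t cap = 128, len = 0;
    char *buf, *tmp;

    assert(f != NULL);
    buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    while (fgets(buf + len, (int)(cap - len), f) != NULL) {
        len += strlen(buf + len);
        if (len > 0 && buf[len - 1] == '\n') {
            buf[len - 1] = '\0';        /* complete line: drop newline */
            return buf;
        }
        /* No newline yet: double the buffer and keep reading. */
        cap *= 2;
        tmp = realloc(buf, cap);
        if (tmp == NULL) {
            free(buf);
            return NULL;
        }
        buf = tmp;
    }
    if (len > 0 && !ferror(f))
        return buf;                     /* last line had no newline */
    free(buf);
    return NULL;
}
```

Calling this in a loop until it returns NULL, and storing each result in a char * array that is grown with realloc, gives the array of lines asked for.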

Perhaps a Linked List would be the best way to do this? The compiler won't like having an array with no idea how big to make it. With a Linked List you can have a really large text file, and not worry about allocating enough memory to the array.

Unfortunately, I haven't learned how to do linked lists, but maybe somebody else could help you.
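
For what it's worth, a linked-list version might look something like the sketch below. The names line_node, list_read and list_free are invented here, and each line is built one character at a time in a scratch buffer before being copied into its node.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct line_node {
    struct line_node *next;
    char *text;                 /* one line, without its '\n' */
};

/* Frees every node in the list along with its text. */
void list_free(struct line_node *head)
{
    while (head != NULL) {
        struct line_node *next = head->next;

        free(head->text);
        free(head);
        head = next;
    }
}

/*
 * Reads all lines of f into a singly linked list, in file order.
 * Stores the list in *head and the line count in *count.
 * Returns 0 on success; on failure returns -1 and leaves *head NULL.
 */
int list_read(FILE *f, struct line_node **head, size_t *count)
{
    struct line_node **tail = head;     /* where the next node is linked */
    struct line_node *node;
    char *buf = NULL, *tmp;
    size_t cap = 0, len = 0;
    int c;

    assert(f != NULL && head != NULL && count != NULL);
    *head = NULL;
    *count = 0;
    while ((c = fgetc(f)) != EOF || len > 0) {
        if (len + 1 >= cap) {           /* grow the scratch buffer */
            cap = cap ? cap * 2 : 64;
            tmp = realloc(buf, cap);
            if (tmp == NULL)
                goto fail;
            buf = tmp;
        }
        if (c != EOF && c != '\n') {
            buf[len++] = (char)c;
            continue;
        }
        /* End of a line: copy the scratch buffer into a new node. */
        node = malloc(sizeof *node);
        if (node == NULL || (node->text = malloc(len + 1)) == NULL) {
            free(node);
            goto fail;
        }
        memcpy(node->text, buf, len);
        node->text[len] = '\0';
        node->next = NULL;
        *tail = node;
        tail = &node->next;
        (*count)++;
        len = 0;
        if (c == EOF)
            break;
    }
    if (ferror(f))
        goto fail;
    free(buf);
    return 0;

fail:
    free(buf);
    list_free(*head);
    *head = NULL;
    return -1;
}
```

Since the count is known once reading finishes, the node pointers can be copied into a plain array afterwards if indexing by line number is needed.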

If you have a good way to read the whole file into memory, you are almost there. After you've done that you could scan the file twice: once to count the lines, and once to set the line pointers and replace '\n' (and maybe '\r' if the file is read in Windows binary mode) with '\0'. In between scans, allocate an array of pointers, now that you know how many you need.
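
In case the "read the whole file into memory" step is the missing piece, here is one possible sketch. It reads in chunks with fread and grows the buffer as it goes, rather than asking for the file size up front, because ftell is not guaranteed to return a byte count for text streams. The name slurp is made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Reads the rest of stream f into a malloc'd buffer. Stores the number
 * of bytes read in *len and returns the buffer, which is always
 * terminated with '\0'. Returns NULL on a read or allocation error.
 */
char *slurp(FILE *f, size_t *len)
{
    size_t cap = 4096, used = 0, got;
    char *buf, *tmp;

    assert(f != NULL && len != NULL);
    buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    /* Always keep one spare byte for the terminator. */
    while ((got = fread(buf + used, 1, cap - used - 1, f)) > 0) {
        used += got;
        if (used == cap - 1) {          /* buffer full: double it */
            cap *= 2;
            tmp = realloc(buf, cap);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
        }
    }
    if (ferror(f)) {
        free(buf);
        return NULL;
    }
    buf[used] = '\0';
    *len = used;
    return buf;
}
```

The two scans described above can then run over the returned buffer, and a single free() of that buffer releases every line at once.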
