
Reading a text file into 2 separate arrays of characters (in C)

For a class I have to write a program to read in a text file in the format of:


TAEDQQ
ZHPNIU
CKEWDI
VUXOFC
BPIRGK
NRTBRB
EXIT
THE
QUICK
BROWN
FOX


I'm trying to get the characters into an array of chars, each line being its own array. I'm able to read from the file okay, and this is the code I use to parse the file:


char** getLinesInFile(char *filepath)  
{  
    FILE *file;  
    const char mode = 'r';  
    file = fopen(filepath, &mode);  
    char **textInFile;  

    /* Reads the number of lines in the file. */
    int numLines = 0;
    char charRead = fgetc(file);
    while (charRead != EOF)
    {
        if(charRead == '\n' || charRead == '\r')
        {
            numLines++;
        }
        charRead = fgetc(file);
    }

    fseek(file, 0L, SEEK_SET);
    textInFile = (char**) malloc(sizeof(char*) * numLines);

    /* Sizes the array of text lines. */
    int line = 0;
    int numChars = 1;
    charRead = fgetc(file);
    while (charRead != EOF)
    {
        if(charRead == '\n' || charRead == '\r')
        {
            textInFile[line] = (char*) malloc(sizeof(char) * numChars);
            line++;
            numChars = 0;
        }
        else if(charRead != ' ')
        {
            numChars++;
        }
        charRead = fgetc(file);
    }

    /* Fill the array with the characters */
    fseek(file, 0L, SEEK_SET);
    charRead = fgetc(file);
    line = 0;
    int charNumber = 0;
    while (charRead != EOF)
    {
        if(charRead == '\n' || charRead == '\r')
        {
            line++;
            charNumber = 0;
        }
        else if(charRead != ' ')
        {
            textInFile[line][charNumber] = charRead;
            charNumber++;
        }
        charRead = fgetc(file);
    }

    return textInFile;
}

This is a run of my program:


Welcome to Word search!

Enter the file you would like us to parse:testFile.txt
TAEDQQ!ZHPNIU!CKEWDI!VUXOFC!BPIRGK!NRTBRB!EXIT!THE!QUICK!BROWN!FOX
Segmentation fault


What's going on? A) Why are the exclamation marks there, and B) why do I get a segfault at the end? The last thing I do in main is iterate through the array of pointers.

1) In the first part of your program, you are miscounting the number of lines in the file. The file has 11 lines, but your program counts 10, because it counts line terminators and the last line ends at EOF without one. Start counting from 1, since a non-empty file always has at least one line. So change

int numLines = 0;

to

int numLines = 1;
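Starting the count at 1 works for this input, but it counts an extra, empty line when the file does end with a newline. A minimal sketch of an alternative is to count newline characters and add one only if the last character read was not a newline. The helper name count_lines is hypothetical and not part of the original program:

```c
#include <assert.h>
#include <stdio.h>

/* Counts the lines in an already-open stream.
 * A final line without a trailing newline is still counted.
 * count_lines is a hypothetical helper name used for illustration. */
int count_lines(FILE *fp)
{
    int lines = 0;
    int prev = '\n';   /* behave as if we start right after a newline */
    int c;             /* int, so EOF can be told apart from real data */

    assert(fp != NULL);

    while ((c = fgetc(fp)) != EOF)
    {
        if (c == '\n')
        {
            lines++;
        }
        prev = c;
    }
    if (prev != '\n')
    {
        lines++;       /* the last line had no terminating newline */
    }
    rewind(fp);        /* leave the stream ready for the next pass */
    return lines;
}
```

For the file in the question (ten newlines, last line ended by EOF) this gives 11, and an empty file gives 0.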

2) In the second part of the program you are miscounting the number of characters on each line. Keep your counter initialization consistent: you initialize numChars to 1 before the loop, so you need to reset it to 1 after each line as well. Change:

numChars = 0;

to

numChars = 1;

This provides enough space for all the non-space characters plus the terminating null character. Keep in mind that C strings are always null-terminated.
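To make the sizing rule concrete, here is a small sketch that copies a line while dropping spaces, as the question's loops do. The helper name copy_without_spaces is an assumption for illustration. The buffer is sized for the worst case (no spaces at all) plus one byte for the terminator:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of src with spaces removed,
 * or NULL if allocation fails. The caller frees the result.
 * copy_without_spaces is a hypothetical helper name. */
char *copy_without_spaces(const char *src)
{
    size_t len;
    size_t j = 0;
    char *dst;

    assert(src != NULL);

    len = strlen(src);
    dst = malloc(len + 1);          /* +1 for the terminating '\0' */
    if (dst == NULL)
    {
        return NULL;
    }
    for (size_t i = 0; i < len; i++)
    {
        if (src[i] != ' ')
        {
            dst[j++] = src[i];
        }
    }
    dst[j] = '\0';                  /* without this, printing runs past the data */
    return dst;
}
```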

3) Your program also does not account for differences in line termination. In my test environment that is not a problem: fgetc returns only one character for the line terminator, even though the file is saved with \r\n terminators.
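If the file is read on a platform, or in a mode, where \r\n is not collapsed into a single \n, the original loops treat \r and \n as two separate line endings, so each real line would be followed by an empty one. One common way to sidestep this, sketched here under the assumption that a line is already in a buffer (for example from fgets), is to cut the string at the first \r or \n. strip_line_ending is a hypothetical helper name:

```c
#include <assert.h>
#include <string.h>

/* Truncates line at the first '\r' or '\n', in place.
 * This handles "\n", "\r\n", and a bare "\r" the same way.
 * strip_line_ending is a hypothetical helper name. */
void strip_line_ending(char *line)
{
    assert(line != NULL);

    /* strcspn returns the length of the prefix that contains
     * neither '\r' nor '\n'; writing '\0' there drops the rest. */
    line[strcspn(line, "\r\n")] = '\0';
}
```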

4) In the second part of your program, you are also not allocating memory for the very last line. This causes your segfault in the third part of your program when you try to access the unallocated space.

Note that your code only allocates a line when it sees \r or \n. The last line is ended by EOF, which matches neither, so your second loop never allocates that line.

To fix this, add this after the second loop: textInFile[line] = (char*) malloc(sizeof(char) * numChars);

5) In your program output you are seeing those odd exclamation marks because you are not null-terminating your strings. Add the line marked "NULL termination" below:

if(charRead == '\n' || charRead == '\r')
{
    textInFile[line][charNumber] = 0; // NULL termination
    line++;
    charNumber = 0;
}

6) Because the loop stops at EOF, the last line has the same problem in your third loop, so you must add this before the return:

textInFile[line][charNumber] = 0; // NULL termination

7) The overall program structure is also a concern. You read the same file character by character three times, which is slow and inefficient.

Fixed code follows below:

char** getLinesInFile(char *filepath)  
{  
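    FILE *file;
    /* fopen expects the mode as a null-terminated string, not the address of a single char. */
    file = fopen(filepath, "r");
    if (file == NULL)
    {
        return NULL; /* the file could not be opened */
    }
    char **textInFile;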

    /* Reads the number of lines in the file. */
    int numLines = 1;
    int charRead = fgetc(file); /* int, not char, so the comparison with EOF is reliable */
    while (charRead != EOF)
    {
        if(charRead == '\n' || charRead == '\r')
        {
            numLines++;
        }
        charRead = fgetc(file);
    }

    fseek(file, 0L, SEEK_SET);
    textInFile = (char**) malloc(sizeof(char*) * numLines);

    /* Sizes the array of text lines. */
    int line = 0;
    int numChars = 1;
    charRead = fgetc(file);
    while (charRead != EOF)
    {
        if(charRead == '\n' || charRead == '\r')
        {
            textInFile[line] = (char*) malloc(sizeof(char) * numChars);
            line++;
            numChars = 1;
        }
        else if(charRead != ' ')
        {
            numChars++;
        }
        charRead = fgetc(file);
    }
    textInFile[line] = (char*) malloc(sizeof(char) * numChars);

    /* Fill the array with the characters */
    fseek(file, 0L, SEEK_SET);
    charRead = fgetc(file);
    line = 0;
    int charNumber = 0;
    while (charRead != EOF)
    {
        if(charRead == '\n' || charRead == '\r')
        {
            textInFile[line][charNumber] = 0; // NULL termination
            line++;
            charNumber = 0;
        }
        else if(charRead != ' ')
        {
            textInFile[line][charNumber] = charRead;
            charNumber++;
        }
        charRead = fgetc(file);
    }
    textInFile[line][charNumber] = 0; // NULL termination

    fclose(file); /* release the file handle before returning */
    return textInFile;
}
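The listing above still makes three passes over the file. As a rough sketch of a single-pass alternative, the function below grows each line and the array of lines with realloc as characters arrive. The names readLinesOnePass and freeLines, the initial capacity of 16, and the outCount parameter are all assumptions made for illustration, not part of the original program. It also opens the file with the string literal "r" (fopen expects a null-terminated mode string) and stores the result of fgetc in an int so that EOF can be distinguished from a valid character:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Frees an array of count strings returned by readLinesOnePass. */
void freeLines(char **lines, size_t count)
{
    for (size_t i = 0; i < count; i++)
    {
        free(lines[i]);
    }
    free(lines);
}

/* Reads all lines in one pass, skipping spaces and '\r'.
 * Returns a malloc'd array of malloc'd strings and stores the line
 * count in *outCount. Returns NULL on failure; an empty file also
 * yields NULL, with *outCount set to 0. Hypothetical helper. */
char **readLinesOnePass(const char *filepath, size_t *outCount)
{
    FILE *file;
    char **lines = NULL;   /* finished lines */
    char *cur = NULL;      /* line currently being built */
    size_t count = 0;      /* number of finished lines */
    size_t len = 0;        /* characters stored in cur */
    size_t cap = 0;        /* bytes allocated for cur */
    int c;                 /* int, so EOF is distinguishable */

    assert(filepath != NULL && outCount != NULL);

    file = fopen(filepath, "r");
    if (file == NULL)
    {
        *outCount = 0;
        return NULL;
    }

    do
    {
        c = fgetc(file);
        if (c == ' ' || c == '\r')
        {
            continue;
        }
        /* Ensure room for one more byte: a character or the '\0'. */
        if (len + 1 > cap)
        {
            size_t newCap = (cap == 0) ? 16 : cap * 2;
            char *tmp = realloc(cur, newCap);
            if (tmp == NULL) goto fail;
            cur = tmp;
            cap = newCap;
        }
        if (c != '\n' && c != EOF)
        {
            cur[len++] = (char)c;
            continue;
        }
        if (c == EOF && len == 0)
        {
            break;         /* nothing pending: empty file or trailing newline */
        }
        /* End of a line: terminate it and append it to the array. */
        cur[len] = '\0';
        char **grown = realloc(lines, (count + 1) * sizeof *lines);
        if (grown == NULL) goto fail;
        lines = grown;
        lines[count++] = cur;
        cur = NULL;
        len = 0;
        cap = 0;
    } while (c != EOF);

    free(cur);
    fclose(file);
    *outCount = count;
    return lines;

fail:
    free(cur);
    freeLines(lines, count);
    fclose(file);
    *outCount = 0;
    return NULL;
}
```

A caller would loop from 0 up to the returned count and pass the array to freeLines when done. Doubling the capacity keeps the number of realloc calls small even for long lines.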

You aren't null terminating your arrays. This probably explains both problems. Be sure to allocate an extra character for the null terminator.

Do this:

if(charRead == '\n')
{
    textInFile[line] = (char*) malloc(sizeof(char) * (numChars+1));
    line++;
    numChars = 0;
}

Then:

if(charRead == '\n')
{
    textInFile[line][charNumber]='\0';
    line++;
    charNumber = 0;
}

Also, you are reading the file three times! This thread has a good explanation of how to read a file efficiently.
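One widely used approach, sketched here as an assumption rather than as what the linked thread recommends, is to read the whole file into a single buffer with one fread call and then split it in memory. slurpFile is a hypothetical name. Seeking to SEEK_END to find the size works on common platforms for regular files, though the C standard does not guarantee it for every stream:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Reads the whole file into one malloc'd, null-terminated buffer.
 * Stores the number of bytes read in *outSize.
 * Returns NULL on failure. slurpFile is a hypothetical helper name. */
char *slurpFile(const char *filepath, size_t *outSize)
{
    FILE *file;
    char *buf;
    long size;
    size_t got;

    assert(filepath != NULL && outSize != NULL);

    file = fopen(filepath, "rb");   /* binary mode: byte counts match ftell */
    if (file == NULL)
    {
        return NULL;
    }
    if (fseek(file, 0L, SEEK_END) != 0)
    {
        fclose(file);
        return NULL;
    }
    size = ftell(file);
    if (size < 0 || fseek(file, 0L, SEEK_SET) != 0)
    {
        fclose(file);
        return NULL;
    }

    buf = malloc((size_t)size + 1); /* +1 for the terminating '\0' */
    if (buf == NULL)
    {
        fclose(file);
        return NULL;
    }
    got = fread(buf, 1, (size_t)size, file);
    buf[got] = '\0';
    *outSize = got;

    fclose(file);
    return buf;
}
```

The buffer can then be split into lines with strtok(buf, "\r\n"), which also skips empty lines, and released with a single free.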
