简体   繁体   中英

Segmentation fault while reading data from file

I'm getting a segmentation fault error while parsing data from a CSV file in C Language. I believe the error is given while reading the last line <person[i].status> as if i comment the same line the code runs perfectly.

Contents of CSV file:

1;A;John Mott;D;30;Z
2;B;Judy Moor;S;60;X
3;A;Kae Blanchett;S;42;y
4;B;Jair Tade;S;21;W
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct Person
{
    int id;
    char key;
    char name[16];
    char rel;
    int age;
    char status;
} Person;

int main()
{
    Person person[12];

    FILE *f = fopen("data.csv", "r");
    char buffer[256];

    if (f != NULL)
    {
        int i = 0;
        printf("\nFile OK!\n");
        printf("Printing persons:\n");
        while (fgets(buffer, 256, f))
        {
            person[i].id = atoi(strtok(buffer, ";"));
            person[i].key = strtok(NULL, ";")[0];
            strcpy(person[i].name, strtok(NULL, ";"));
            person[i].rel = strtok(NULL, ";")[0];
            person[i].age = atoi(strtok(buffer, ";"));
            person[i].status = strtok(NULL, ";")[0]; // error: segmentation fault

            printf("id: %d\n", person[i].id);
            printf("key: %c\n", person[i].key);
            printf("name: %s\n", person[i].name);
            printf("rel: %c\n", person[i].rel);
            printf("age: %d\n", person[i].age);
            printf("status: %c\n", person[i].status);
            i++;
        }
    }
    else
    {
        printf("\nFile BAD!\n");
    }

    return 0;
}

Thank you for your help!

While you have a good answer addressing your problems with strtok() , you may be over-complicating your code by using strtok() to begin with. When reading a delimited file with a fixed delimiter, reading a line-at-a-time into a sufficiently sized buffer and then separating the buffer into the needed values with sscanf() can provide a succinct (and in the case of your use of atoi() a more robust) solution.

Your fields are easily separated in this case using a carefully crafted format-string . For example, reading each line into a buffer ( buf in this case) you can separate each of the lines into the needed values with:

        if (sscanf (buf, "%d;%c;%15[^;];%c;%d;%c",      /* convert to person/VALIDATE */
                    &person[n].id, &person[n].key, person[n].name,
                    &person[n].rel, &person[n].age, &person[n].status) == 6)

The conversion to int by sscanf() at least minimally validates the integer conversion. Not so with atoi() which will happily take atoi ("my cow") and fail silently returning zero without any indication things have gone wrong.

Note, in every conversion to string, you must provide a field-width modifier to limit the number of characters stored to one less than your array can hold (saving room for the '\0' nul-terminating character). Otherwise the use of the scanf() family "%s" or "%[..]" is no safer than using gets() . See Why gets() is so dangerous it should never be used!

The same protection of your array bounds for person[] applies on your read loop. Simply keeping a count of the successful conversions and testing before the next read is all you need, eg

#define NPERSONS  12        /* if you need a constant, #define one (or more) */
#define MAXNAME   16
#define MAXC    1024
...
    char buf[MAXC];                                     /* buffer to hold each line */
    size_t n = 0;                                       /* person counter/index */
    Person person[NPERSONS] = {{ .id = 0 }};            /* initialize all elements */
    /* use filename provided as 1st argument (stdin by default) */
    FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
    ...
    while (n < NPERSONS && fgets (buf, MAXC, fp)) {     /* protect array, read line   */
        if (sscanf (buf, "%d;%c;%15[^;];%c;%d;%c",      /* convert to person/VALIDATE */
                    &person[n].id, &person[n].key, person[n].name,
                    &person[n].rel, &person[n].age, &person[n].status) == 6)
            n++;        /* increment count on good conversion */
    }

As shown with the #define s above, don't use MagicNumbers in your code. (eg 12 , 16 ). Instead declare a constant at the top of your code that provides a convenient single-location to change if your limits later need adjustment.

In the same vein, do not hardcode filenames. There is no reason you should have to re-compile your code just to read from a different file. Pass the filename as the first argument to your program (that's what argc and argv are for), or prompt the user and take the filename as input. Above, the code takes the filename as the first argument, or reads from stdin by default if no argument is provided (like most Unix utilities do).

Putting that altogether, you could do something similar to:

#include <stdio.h>

#define NPERSONS  12        /* if you need a constant, #define one (or more) */
#define MAXNAME   16
#define MAXC    1024

typedef struct Person {
    int id;
    char key;
    char name[MAXNAME];
    char rel;
    int age;
    char status;
} Person;

int main (int argc, char **argv) {

    char buf[MAXC];                                     /* buffer to hold each line */
    size_t n = 0;                                       /* person counter/index */
    Person person[NPERSONS] = {{ .id = 0 }};            /* initialize all elements */
    /* use filename provided as 1st argument (stdin by default) */
    FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;

    if (!fp) {  /* validate file open for reading */
        perror ("file open failed");
        return 1;
    }
    
    while (n < NPERSONS && fgets (buf, MAXC, fp)) {     /* protect array, read line   */
        if (sscanf (buf, "%d;%c;%15[^;];%c;%d;%c",      /* convert to person/VALIDATE */
                    &person[n].id, &person[n].key, person[n].name,
                    &person[n].rel, &person[n].age, &person[n].status) == 6)
            n++;        /* increment count on good conversion */
    }
    if (fp != stdin)   /* close file if not stdin */
        fclose (fp);
    
    for (size_t i = 0; i < n; i++)                      /* output results */
        printf ("person[%zu]  %3d  %c  %-15s  %c  %3d  %c\n", i,
                person[i].id, person[i].key, person[i].name,
                person[i].rel, person[i].age, person[i].status);
}

( note: you only need one call to printf() to output any contiguous block of output with conversions. If you have no conversions required, use puts() or fputs() if end-of-line control is needed)

Lastly, do not skimp on buffer size . 16 seems horribly short for a name field ( 64 is still pushing it). By using the field-width modifier you are protected against Undefined Behavior due to overwriting your array bounds (the code will simply skip the line), but you should add an else {... } condition to output an error in that case. 16 is sufficient for your example data, but for general use, you would want to adjust that to a larger value.

Example Use/Output

With your sample input in the file named dat/person_id-status.txt , you could do:

$ ./bin/person_id-status dat/person_id-status.txt
person[0]    1  A  John Mott        D   30  Z
person[1]    2  B  Judy Moor        S   60  X
person[2]    3  A  Kae Blanchett    S   42  y
person[3]    4  B  Jair Tade        S   21  W

Those there the main points that struct me looking over your code. (I'm sure I've forgotten to mention one or two more) Look things over and let me know if you have further questions.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM