
Segmentation fault while reading data from file

I'm getting a segmentation fault while parsing data from a CSV file in C. I believe the error occurs when reading the last field (person[i].status), because the code runs perfectly if I comment out that line.

Contents of CSV file:

1;A;John Mott;D;30;Z
2;B;Judy Moor;S;60;X
3;A;Kae Blanchett;S;42;y
4;B;Jair Tade;S;21;W

Code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct Person
{
    int id;
    char key;
    char name[16];
    char rel;
    int age;
    char status;
} Person;

int main()
{
    Person person[12];

    FILE *f = fopen("data.csv", "r");
    char buffer[256];

    if (f != NULL)
    {
        int i = 0;
        printf("\nFile OK!\n");
        printf("Printing persons:\n");
        while (fgets(buffer, 256, f))
        {
            person[i].id = atoi(strtok(buffer, ";"));
            person[i].key = strtok(NULL, ";")[0];
            strcpy(person[i].name, strtok(NULL, ";"));
            person[i].rel = strtok(NULL, ";")[0];
            person[i].age = atoi(strtok(buffer, ";"));
            person[i].status = strtok(NULL, ";")[0]; // error: segmentation fault

            printf("id: %d\n", person[i].id);
            printf("key: %c\n", person[i].key);
            printf("name: %s\n", person[i].name);
            printf("rel: %c\n", person[i].rel);
            printf("age: %d\n", person[i].age);
            printf("status: %c\n", person[i].status);
            i++;
        }
    }
    else
    {
        printf("\nFile BAD!\n");
    }

    return 0;
}

Thank you for your help!

While you have a good answer addressing your problems with strtok(), you may be over-complicating your code by using strtok() to begin with. When reading a delimited file with a fixed delimiter, reading a line at a time into a sufficiently sized buffer and then separating the buffer into the needed values with sscanf() can provide a succinct (and, compared with your use of atoi(), a more robust) solution.

Your fields are easily separated in this case using a carefully crafted format string. For example, after reading each line into a buffer (buf in this case), you can separate each line into the needed values with:

        if (sscanf (buf, "%d;%c;%15[^;];%c;%d;%c",      /* convert to person/VALIDATE */
                    &person[n].id, &person[n].key, person[n].name,
                    &person[n].rel, &person[n].age, &person[n].status) == 6)

The conversion to int by sscanf() at least minimally validates the integer conversion. Not so with atoi(), which will happily take atoi ("my cow") and fail silently, returning zero without any indication that things have gone wrong.

Note that in every conversion to string, you must provide a field-width modifier to limit the number of characters stored to one less than your array can hold (saving room for the '\0' nul-terminating character). Otherwise, using the scanf() family with "%s" or "%[..]" is no safer than using gets(). See "Why gets() is so dangerous it should never be used!"

The same protection of your array bounds applies to person[] in your read loop. Simply keeping a count of the successful conversions and testing it before the next read is all you need, e.g.

#define NPERSONS  12        /* if you need a constant, #define one (or more) */
#define MAXNAME   16
#define MAXC    1024
...
    char buf[MAXC];                                     /* buffer to hold each line */
    size_t n = 0;                                       /* person counter/index */
    Person person[NPERSONS] = {{ .id = 0 }};            /* initialize all elements */
    /* use filename provided as 1st argument (stdin by default) */
    FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
    ...
    while (n < NPERSONS && fgets (buf, MAXC, fp)) {     /* protect array, read line   */
        if (sscanf (buf, "%d;%c;%15[^;];%c;%d;%c",      /* convert to person/VALIDATE */
                    &person[n].id, &person[n].key, person[n].name,
                    &person[n].rel, &person[n].age, &person[n].status) == 6)
            n++;        /* increment count on good conversion */
    }

As shown with the #defines above, don't use magic numbers in your code (e.g. 12, 16). Instead, declare a constant at the top of your code that provides a convenient single location to change if your limits later need adjustment.

In the same vein, do not hardcode filenames. There is no reason you should have to re-compile your code just to read from a different file. Pass the filename as the first argument to your program (that's what argc and argv are for), or prompt the user and take the filename as input. Above, the code takes the filename as the first argument, or reads from stdin by default if no argument is provided (like most Unix utilities do).

Putting it all together, you could do something similar to:

#include <stdio.h>

#define NPERSONS  12        /* if you need a constant, #define one (or more) */
#define MAXNAME   16
#define MAXC    1024

typedef struct Person {
    int id;
    char key;
    char name[MAXNAME];
    char rel;
    int age;
    char status;
} Person;

int main (int argc, char **argv) {

    char buf[MAXC];                                     /* buffer to hold each line */
    size_t n = 0;                                       /* person counter/index */
    Person person[NPERSONS] = {{ .id = 0 }};            /* initialize all elements */
    /* use filename provided as 1st argument (stdin by default) */
    FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;

    if (!fp) {  /* validate file open for reading */
        perror ("file open failed");
        return 1;
    }
    
    while (n < NPERSONS && fgets (buf, MAXC, fp)) {     /* protect array, read line   */
        if (sscanf (buf, "%d;%c;%15[^;];%c;%d;%c",      /* convert to person/VALIDATE */
                    &person[n].id, &person[n].key, person[n].name,
                    &person[n].rel, &person[n].age, &person[n].status) == 6)
            n++;        /* increment count on good conversion */
    }
    if (fp != stdin)   /* close file if not stdin */
        fclose (fp);
    
    for (size_t i = 0; i < n; i++)                      /* output results */
        printf ("person[%zu]  %3d  %c  %-15s  %c  %3d  %c\n", i,
                person[i].id, person[i].key, person[i].name,
                person[i].rel, person[i].age, person[i].status);
}

(Note: you only need one call to printf() to output any contiguous block of output that contains conversions. If no conversions are required, use puts(), or fputs() if you need control over the end-of-line.)

Lastly, do not skimp on buffer size. 16 seems horribly short for a name field (64 is still pushing it). By using the field-width modifier you are protected against Undefined Behavior due to overwriting your array bounds. A name longer than 15 characters is cut off at the limit, the ';' that follows in the format string then fails to match, sscanf() returns fewer than 6, and the code simply skips the line. You should add an else {...} condition to output an error in that case. 16 is sufficient for your example data, but for general use you would want to adjust that to a larger value.

Example Use/Output

With your sample input in the file named dat/person_id-status.txt, you could do:

$ ./bin/person_id-status dat/person_id-status.txt
person[0]    1  A  John Mott        D   30  Z
person[1]    2  B  Judy Moor        S   60  X
person[2]    3  A  Kae Blanchett    S   42  y
person[3]    4  B  Jair Tade        S   21  W

Those were the main points that struck me while looking over your code. (I'm sure I've forgotten to mention one or two more.) Look things over and let me know if you have further questions.
