
Reading CSV from text file in C

I'm trying to read CSV from a text file in C. The text file format is

1,Bob,bob@gmail.com
2,Daniel,daniel@gmail.com
3,John,john@gmail.com

When I run the program, the number displays fine but the name and email are being displayed as garbage. Here is my program...

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    int number;
    char* name;
    char* email;
} Owner;

Owner owners[100];

int load(char* filename)
{
    char buffer[200];
    char token[50];
    Owner* owner;
    int owners_size = 0;
    FILE* file = fopen(filename, "r");

    while(fgets(buffer, 200, file) != NULL)
    {
        owner = (Owner*)malloc(sizeof(Owner));
        owner->number = atoi(strtok(buffer, ","));
        owner->name = strtok(NULL, ",");
        owner->email = strtok(NULL, ",");
        owners[owners_size++] = *owner;
    }

    fclose(file);
    return owners_size;
}

int main()
{
    int choise, owners_size, index;
    char* owners_filename = "owners2.txt";

    owners_size = load(owners_filename);

    if(owners_size)
    {
        printf("owners size: %d\n\n", owners_size);

        for(index = 0; index < owners_size; index++)
            printf("%d, %s %s\n", owners[index].number, owners[index].name, owners[index].email);
    }
}

Can anyone tell me what the reason is? I appreciate your help.

Two problems:

  1. You didn't allocate space for the strings in the structure:

     typedef struct {
         int number;
         char *name;
         char *email;
     } Owner;

    You need to provide storage for those pointers to point at, to hold the names and emails.

  2. You keep on supplying pointers to the buffer which is reused for each line of input:

     while(fgets(buffer, 200, file) != NULL)
     {
         owner = (Owner*)malloc(sizeof(Owner));
         owner->number = atoi(strtok(buffer, ","));
         owner->name = strtok(NULL, ",");
         owner->email = strtok(NULL, ",");
         owners[owners_size++] = *owner;
     }

    The first line gets stored as some pointers into the buffer. The next line then overwrites the buffer and chops the line up again, trampling all over the original input.
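To make the second problem concrete, here is a small sketch that parses two lines through the same buffer, the way the loop does. The function name demo_buffer_reuse() is made up for this illustration.

```c
#include <string.h>

/* Hypothetical demo: returns 1 if the name pointer saved from the first
 * line ends up showing the second line's name after the buffer is reused. */
int demo_buffer_reuse(void)
{
    char buffer[200];
    char *first_name;
    char *second_name;

    strcpy(buffer, "1,Bob,bob@gmail.com\n");
    strtok(buffer, ",");                /* number field */
    first_name = strtok(NULL, ",");     /* points into buffer */

    strcpy(buffer, "2,Daniel,daniel@gmail.com\n");
    strtok(buffer, ",");
    second_name = strtok(NULL, ",");    /* same address as first_name */

    /* Both pointers refer to the same bytes, which now hold "Daniel". */
    return first_name == second_name && strcmp(first_name, "Daniel") == 0;
}
```

In the original program every stored name and email aliases that one buffer in the same way, and the buffer itself no longer exists once load() returns.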

Consider using strdup():

while (fgets(buffer, 200, file) != NULL)
{
    owner = (Owner *)malloc(sizeof(Owner));
    owner->number = atoi(strtok(buffer, ","));
    owner->name = strdup(strtok(NULL, ","));
    owner->email = strdup(strtok(NULL, ","));
    owners[owners_size++] = *owner;
}

This is slightly dangerous code (I'd not use it in production code) because it doesn't check that strtok() found a token when expected (or that strdup() was successful). There again, I wouldn't use strtok() in production code either; I'd use POSIX strtok_r() or Microsoft's strtok_s() if they were available, or some alternative technique, probably using strspn() and strcspn(). If strdup() is not available, you can write your own, with the same or a different name:

char *strdup(const char *str)
{
    size_t len = strlen(str) + 1;
    char *dup = malloc(len);
    if (dup != 0)
        memmove(dup, str, len);  // Or memcpy() - that is safe in this context
    return(dup);
}
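For what it's worth, here is a minimal sketch of the checks mentioned above, assuming the same three-field line format as the question. parse_owner() and dup_string() are hypothetical names, not library functions; dup_string() is just the replacement above under a different name.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    int number;
    char *name;
    char *email;
} Owner;

/* Hypothetical helper: a strdup() equivalent under a different name. */
static char *dup_string(const char *str)
{
    size_t len = strlen(str) + 1;
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, str, len);
    return copy;
}

/* Hypothetical helper: parse one "number,name,email" line into *out.
 * Modifies line, because strtok() writes NUL bytes into it.
 * Returns 0 on success, -1 on a missing field or allocation failure. */
int parse_owner(char *line, Owner *out)
{
    char *number = strtok(line, ",");
    char *name   = strtok(NULL, ",");
    char *email  = strtok(NULL, ",\r\n");   /* also drops the line ending */

    if (number == NULL || name == NULL || email == NULL)
        return -1;

    out->number = atoi(number);
    out->name = dup_string(name);
    out->email = dup_string(email);
    if (out->name == NULL || out->email == NULL) {
        free(out->name);                    /* free(NULL) is a no-op */
        free(out->email);
        out->name = out->email = NULL;
        return -1;
    }
    return 0;
}
```

The caller owns out->name and out->email on success and should free() them when done.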

You might note that your code is only suitable for simple CSV files. If you encountered a line like this (which is legitimate CSV), you'd have problems (with quotes in your values, and mis-splitting because of the comma inside the quoted string):

1,"Bob ""The King"" King","Bob King, Itinerant Programmer <bob@gmail.com>"
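One way to cope with a line like that is to split fields by hand instead of with strtok(). The following is only a sketch under simplifying assumptions: next_csv_field() is a made-up name, it rewrites the line in place, and it does not handle quoted fields that contain embedded newlines.

```c
#include <stddef.h>

/* Hypothetical helper: extract the next CSV field from *cursor, in place.
 * Handles quoted fields, doubled quotes ("") and commas inside quotes.
 * Returns the NUL-terminated field (stored inside the original buffer),
 * or NULL when no fields remain. Sets *cursor to NULL after the last field. */
char *next_csv_field(char **cursor)
{
    char *src = *cursor;
    char *field;
    char *dst;

    if (src == NULL)
        return NULL;

    field = dst = src;
    if (*src == '"') {
        src++;                              /* skip the opening quote */
        while (*src != '\0') {
            if (*src == '"') {
                if (src[1] == '"') {        /* "" is an escaped quote */
                    *dst++ = '"';
                    src += 2;
                } else {                    /* closing quote */
                    src++;
                    break;
                }
            } else {
                *dst++ = *src++;
            }
        }
    }

    /* Unquoted text runs up to the next comma or the end of the line. */
    while (*src != '\0' && *src != ',' && *src != '\n' && *src != '\r')
        *dst++ = *src++;

    /* Decide where the next field starts before overwriting anything. */
    *cursor = (*src == ',') ? src + 1 : NULL;
    *dst = '\0';
    return field;
}
```

Call it repeatedly with a cursor that starts at the beginning of the line; on the example line above it should yield 1, then Bob "The King" King, then the full quoted description with its comma intact.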

The pointer returned by strtok() points to an address within the buffer it is parsing, in this case the local variable buffer. When load() returns, that variable goes out of scope (and even if it did not, every element of owners would point into the same buffer). You need to copy the string returned by strtok(). You could use strdup() if available, or malloc() and strcpy().

There is no need to malloc() new instances of Owner, as an array of them already exists (the code as it stands has a memory leak).

Note there is no protection against going beyond the bounds of the owners array. If the file has more than 100 entries then the loop will go beyond the bounds of the array. Extend the terminating condition of the while to prevent this:

while(owners_size < sizeof(owners) / sizeof(owners[0]) &&
      fgets(buffer, 200, file) != NULL)
{
}
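Putting those points together, the loop might look something like the sketch below. It fills the array slot directly instead of calling malloc() for each Owner, copies each string, and stops at the array bound. load_owners() and copy_string() are hypothetical names, and passing the FILE*, the array and its capacity as parameters is a choice made for this sketch rather than something from the original program.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    int number;
    char *name;
    char *email;
} Owner;

/* Hypothetical helper: malloc() + strcpy(), as suggested above. */
static char *copy_string(const char *str)
{
    char *copy = malloc(strlen(str) + 1);   /* +1 for the terminating NUL */
    if (copy != NULL)
        strcpy(copy, str);
    return copy;
}

/* Read at most capacity records from file into owners[].
 * Returns the number of records stored; stops early at the first
 * malformed line or allocation failure. */
size_t load_owners(FILE *file, Owner *owners, size_t capacity)
{
    char buffer[200];
    size_t count = 0;

    while (count < capacity && fgets(buffer, sizeof buffer, file) != NULL) {
        Owner *owner = &owners[count];      /* fill the array slot directly */
        char *number = strtok(buffer, ",");
        char *name   = strtok(NULL, ",");
        char *email  = strtok(NULL, ",\r\n");

        if (number == NULL || name == NULL || email == NULL)
            break;

        owner->number = atoi(number);
        owner->name = copy_string(name);
        owner->email = copy_string(email);
        if (owner->name == NULL || owner->email == NULL) {
            free(owner->name);
            free(owner->email);
            break;
        }
        count++;
    }
    return count;
}
```

With the global array from the question, the call would be along the lines of load_owners(file, owners, sizeof(owners) / sizeof(owners[0])).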

You just stored pointers into a local buffer. When you leave load() this buffer is gone and not accessible anymore.

You must allocate memory for name and email before you can copy it into the Owner struct.

char *tok;
size_t len;

tok = strtok(NULL, ",");
len = strlen(tok);
owner->name = malloc(len + 1);
strcpy(owner->name, tok);
...

[EDIT: you need to allocate len+1 bytes so you have space for the terminating NUL character. -Zack]

You've only got one line buffer. Every cycle of the loop in load clobbers the text from the previous cycle. And if that wasn't bad enough, the buffer is destroyed when load returns.

The quick fix is to change

owner->name = strtok(NULL, ",");
owner->email = strtok(NULL, ",");

to

owner->name = strdup(strtok(NULL, ","));
owner->email = strdup(strtok(NULL, ","));

(If you don't have strdup, it's very simple to write.)

If I were reviewing your code, though, I would ding you for the fixed-size line buffer, the fixed-size owners array, the memory leak, using atoi instead of strtol, using strtok instead of strsep, and the absence of quote handling and parse error recovery, and point out that it would be more efficient to allocate each line as a unit and then save pointers into it.
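As a rough sketch of two of those points (strtol instead of atoi, and one allocation per line with pointers saved into it), something like the following could work. OwnerRecord and parse_owner_record() are hypothetical names, the extra line member is an assumption added so the caller has something to free(), and strchr() stands in for strsep() to stay within standard C. Quote handling is still missing.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    int number;
    char *name;     /* points into line */
    char *email;    /* points into line */
    char *line;     /* the single allocation backing this record */
} OwnerRecord;

/* Hypothetical helper: copy one "number,name,email" line into a single
 * allocation and split it in place.
 * Returns 0 on success, -1 on parse or allocation failure. */
int parse_owner_record(const char *text, OwnerRecord *out)
{
    size_t len = strcspn(text, "\r\n");     /* length without line ending */
    char *line = malloc(len + 1);
    char *end;
    char *name;
    char *email;
    long number;

    if (line == NULL)
        return -1;
    memcpy(line, text, len);
    line[len] = '\0';

    /* Unlike atoi(), strtol() reports where it stopped and range errors. */
    errno = 0;
    number = strtol(line, &end, 10);
    if (end == line || *end != ',' || errno == ERANGE ||
        number < INT_MIN || number > INT_MAX)
        goto fail;

    name = end + 1;
    email = strchr(name, ',');
    if (email == NULL)
        goto fail;
    *email++ = '\0';                        /* terminate name, skip the comma */

    out->number = (int)number;
    out->name = name;
    out->email = email;
    out->line = line;
    return 0;

fail:
    free(line);
    return -1;
}
```

Each record then needs only one free(record.line), and a line such as abc,John,john@gmail.com is rejected instead of silently becoming number 0.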
