I'm trying to read CSV from a text file in C. The text file format is:
1,Bob,bob@gmail.com
2,Daniel,daniel@gmail.com
3,John,john@gmail.com
When I run the program, the number displays fine but the name and email are being displayed as garbage. Here is my program...
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    int number;
    char* name;
    char* email;
} Owner;

Owner owners[100];

int load(char* filename)
{
    char buffer[200];
    char token[50];
    Owner* owner;
    int owners_size = 0;
    FILE* file = fopen(filename, "r");
    while(fgets(buffer, 200, file) != NULL)
    {
        owner = (Owner*)malloc(sizeof(Owner));
        owner->number = atoi(strtok(buffer, ","));
        owner->name = strtok(NULL, ",");
        owner->email = strtok(NULL, ",");
        owners[owners_size++] = *owner;
    }
    fclose(file);
    return owners_size;
}

int main()
{
    int choise, owners_size, index;
    char* owners_filename = "owners2.txt";
    owners_size = load(owners_filename);
    if(owners_size)
    {
        printf("owners size: %d\n\n", owners_size);
        for(index = 0; index < owners_size; index++)
            printf("%d, %s %s\n", owners[index].number, owners[index].name, owners[index].email);
    }
}
Can anyone tell me what the reason is? I appreciate your help.
Two problems:
You didn't allocate space for the strings in the structure:
typedef struct {
    int number;
    char *name;
    char *email;
} Owner;
You need to provide space for those pointers to point at to hold the names.
You keep on supplying pointers to the buffer which is reused for each line of input:
while(fgets(buffer, 200, file) != NULL)
{
    owner = (Owner*)malloc(sizeof(Owner));
    owner->number = atoi(strtok(buffer, ","));
    owner->name = strtok(NULL, ",");
    owner->email = strtok(NULL, ",");
    owners[owners_size++] = *owner;
}
The first line gets stored as some pointers into the buffer. The next line then overwrites the buffer and chops the line up again, trampling all over the original input.
Consider using strdup():
while (fgets(buffer, 200, file) != NULL)
{
    owner = (Owner *)malloc(sizeof(Owner));
    owner->number = atoi(strtok(buffer, ","));
    owner->name = strdup(strtok(NULL, ","));
    owner->email = strdup(strtok(NULL, ","));
    owners[owners_size++] = *owner;
}
This is slightly dangerous code (I'd not use it in production code) because it doesn't check that strtok() found a token when expected (or that strdup() was successful). There again, I wouldn't use strtok() in production code either; I'd use POSIX strtok_r() or Microsoft's strtok_s() if they were available, or some alternative technique, probably using strspn() and strcspn(). If strdup() is not available, you can write your own, with the same or a different name:
char *strdup(const char *str)
{
    size_t len = strlen(str) + 1;
    char *dup = malloc(len);
    if (dup != 0)
        memmove(dup, str, len); // Or memcpy() - that is safe in this context
    return(dup);
}
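The caveats above about unchecked strtok() results can be addressed with the strcspn() approach mentioned earlier. What follows is only a minimal sketch, assuming the simple three-field format from the question (no quoting); parse_owner() and copy_field() are hypothetical helper names, not part of any library.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    int number;
    char *name;
    char *email;
} Owner;

/* Copy the first len bytes of src into a freshly allocated, NUL-terminated string. */
static char *copy_field(const char *src, size_t len)
{
    char *dst = malloc(len + 1);
    if (dst != NULL) {
        memcpy(dst, src, len);
        dst[len] = '\0';
    }
    return dst;
}

/*
 * Parse "number,name,email" into *out. Returns 0 on success, -1 if a field
 * is missing or an allocation fails. The input line is not modified.
 */
int parse_owner(const char *line, Owner *out)
{
    const char *p = line;
    size_t len;

    /* Field 1: the number, which must be followed by a comma. */
    len = strcspn(p, ",\r\n");
    if (len == 0 || p[len] != ',')
        return -1;
    out->number = atoi(p);
    p += len + 1;

    /* Field 2: the name, also terminated by a comma. */
    len = strcspn(p, ",\r\n");
    if (p[len] != ',')
        return -1;
    out->name = copy_field(p, len);
    p += len + 1;

    /* Field 3: the email, which runs to the next comma or the end of the line. */
    len = strcspn(p, ",\r\n");
    out->email = copy_field(p, len);

    if (out->name == NULL || out->email == NULL) {
        free(out->name);
        free(out->email);
        out->name = out->email = NULL;
        return -1;
    }
    return 0;
}
```

Inside the fgets() loop you would call parse_owner(buffer, &owners[owners_size]) and only increment owners_size when it returns 0. Because the function never writes into the line buffer, reusing that buffer for the next line is harmless.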
You might note that your code is only suitable for simple CSV files. If you encountered a line like this (which is legitimate CSV), you'd have problems (with quotes in your values, and mis-splitting because of the comma inside the quoted string):
1,"Bob ""The King"" King","Bob King, Itinerant Programmer <bob@gmail.com>"
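To show what handling such a line involves, here is a rough sketch of a field reader that understands double-quoted fields, where "" stands for a literal quote and commas inside quotes are ordinary characters. The function name read_csv_field() and its interface are made up for illustration, and it assumes a whole record is already in memory as one string.

```c
#include <stddef.h>

/*
 * Read one CSV field starting at *cursor into out (capacity cap bytes,
 * including the terminating NUL). Advances *cursor past the field and
 * its trailing comma, if any.
 * Returns 0 on success, -1 if the field does not fit or a quote is unclosed.
 */
int read_csv_field(const char **cursor, char *out, size_t cap)
{
    const char *p = *cursor;
    size_t n = 0;
    int quoted = (*p == '"');

    if (cap == 0)
        return -1;
    if (quoted)
        p++;                          /* skip the opening quote */

    for (;;) {
        char c = *p;
        if (quoted) {
            if (c == '\0')
                return -1;            /* unterminated quoted field */
            if (c == '"') {
                if (p[1] == '"') {
                    p++;              /* doubled quote: keep one literal quote */
                } else {
                    p++;              /* closing quote ends the field */
                    break;
                }
            }
        } else if (c == ',' || c == '\n' || c == '\r' || c == '\0') {
            break;                    /* unquoted field ends at a separator */
        }
        if (n + 1 >= cap)
            return -1;                /* no room for this byte plus the NUL */
        out[n++] = *p++;
    }
    out[n] = '\0';
    if (*p == ',')
        p++;                          /* step over the separator */
    *cursor = p;
    return 0;
}
```

Calling it three times on the example line, with a cursor that starts at the beginning of the line, should yield 1, then Bob "The King" King, then Bob King, Itinerant Programmer <bob@gmail.com>.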
The pointer returned by strtok() points to an address within the buffer it is parsing, in this case the local variable buffer. When load() returns, that variable goes out of scope (and even if it didn't, every entry in owners would point into the same buffer). You need to copy the string returned by strtok(). You could use strdup() if it is available, or malloc() and strcpy().
There is no need to malloc() new instances of Owner, as an array of them already exists (the code as it stands has a memory leak).
Note there is no protection against going beyond the bounds of the owners array. If the file has more than 100 entries then the loop will go beyond the bounds of the array. Extend the terminating condition of the while to prevent this:
while(owners_size < sizeof(owners) / sizeof(owners[0]) &&
      fgets(buffer, 200, file) != NULL)
{
}
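Putting the points above together (copy each token, write straight into the existing array instead of calling malloc() per record, and stop at the array bound), the loader might be sketched as follows. This is an illustration under assumptions rather than a drop-in replacement: load_owners() and copy_string() are hypothetical names, the function takes an already opened FILE * plus an explicit capacity, and lines with missing fields are simply skipped.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    int number;
    char *name;
    char *email;
} Owner;

/* Portable stand-in for strdup(); returns NULL for NULL input or on allocation failure. */
static char *copy_string(const char *s)
{
    char *copy;
    size_t len;

    if (s == NULL)
        return NULL;
    len = strlen(s) + 1;
    copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

/*
 * Read up to capacity records from fp into owners[].
 * Returns the number of records stored; malformed lines are skipped.
 */
size_t load_owners(FILE *fp, Owner *owners, size_t capacity)
{
    char buffer[200];
    size_t count = 0;

    while (count < capacity && fgets(buffer, sizeof buffer, fp) != NULL) {
        char *number = strtok(buffer, ",");
        char *name   = strtok(NULL, ",");
        char *email  = strtok(NULL, ",\r\n");   /* also trims the line ending */
        Owner *slot  = &owners[count];          /* write into the array directly */

        if (number == NULL || name == NULL || email == NULL)
            continue;                           /* missing field: skip the line */

        slot->number = atoi(number);
        slot->name   = copy_string(name);
        slot->email  = copy_string(email);
        if (slot->name == NULL || slot->email == NULL) {
            free(slot->name);
            free(slot->email);
            break;                              /* out of memory: stop loading */
        }
        count++;
    }
    return count;
}
```

With the global array from the question, the call would look like load_owners(file, owners, sizeof(owners) / sizeof(owners[0])). The caller remains responsible for freeing each name and email when the records are no longer needed.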
You just stored pointers into a local buffer. When you leave load(), this buffer is gone and no longer accessible. You must allocate memory for name and email before you can copy the strings into the Owner struct.
char *tok;
size_t len;
tok = strtok(NULL, ",");
len = strlen(tok);
owner->name = malloc(len + 1);
strcpy(owner->name, tok);
...
[EDIT: you need to allocate len+1 bytes so you have space for the terminating NUL character. -Zack]
You've only got one line buffer. Every cycle of the loop in load clobbers the text from the previous cycle. And if that wasn't bad enough, the buffer is destroyed when load returns.
The quick fix is to change
owner->name = strtok(NULL, ",");
owner->email = strtok(NULL, ",");
to
owner->name = strdup(strtok(NULL, ","));
owner->email = strdup(strtok(NULL, ","));
(If you don't have strdup, it's very simple to write.)
If I were reviewing your code, though, I would ding you for the fixed-size line buffer, the fixed-size owners array, the memory leak, using atoi instead of strtol, using strtok instead of strsep, and the absence of quote handling and parse error recovery, and point out that it would be more efficient to allocate each line as a unit and then save pointers into it.
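Two of those review points, strtol instead of atoi and allocating each line as a unit, could look roughly like the sketch below. The Record type and parse_record() are hypothetical names introduced only for this illustration; the sketch splits on commas with strchr() (strsep is not available everywhere) and still ignores quoting.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    int number;
    char *line;    /* owns the storage for this record; free() only this */
    char *name;    /* points into line */
    char *email;   /* points into line */
} Record;

/*
 * Copy text into one heap block, split it in place, and point the fields
 * at the pieces. Returns 0 on success, -1 on a malformed line or
 * allocation failure.
 */
int parse_record(const char *text, Record *out)
{
    size_t len = strlen(text) + 1;
    char *line = malloc(len);
    char *first_comma, *second_comma, *end;
    long value;

    if (line == NULL)
        return -1;
    memcpy(line, text, len);
    line[strcspn(line, "\r\n")] = '\0';        /* drop the line ending */

    first_comma = strchr(line, ',');
    second_comma = first_comma ? strchr(first_comma + 1, ',') : NULL;
    if (second_comma == NULL) {
        free(line);
        return -1;                             /* fewer than three fields */
    }
    *first_comma = '\0';
    *second_comma = '\0';

    /* Unlike atoi(), strtol() lets us detect garbage and overflow. */
    errno = 0;
    value = strtol(line, &end, 10);
    if (end == line || *end != '\0' || errno == ERANGE ||
        value < INT_MIN || value > INT_MAX) {
        free(line);
        return -1;                             /* not a valid int */
    }

    out->number = (int)value;
    out->line = line;
    out->name = first_comma + 1;
    out->email = second_comma + 1;
    return 0;
}
```

Each record then costs one allocation instead of two, and cleanup is a single free(record.line).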