I am trying to read a text file with the following format, using fgets() and strtok().
1082018 1200 79 Meeting with President
2012018 1200 79 Meet with John at cinema
2082018 1400 30 games with Alpha
3022018 1200 79 sports
I need to separate the first value from the rest of the line, for example:
key=21122019, val = 1200 79 Meeting with President
To do so I am using strchr() for val and strtok() for key. However, the key value remains unchanged when reading from the file. I can't understand why this is happening, since I am allocating space for in_key inside the while loop and placing it inside an array at a different index each time.
My code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define N 1000 // max number of lines to be read
#define VALLEN 100
#define MAXC 1024
#define ALLOCSIZE 1000 /*size of available space*/
static char allocbuf[ALLOCSIZE]; /* storage for alloc*/
static char *allocp = allocbuf; /* next free position*/
char *alloc(int n) { /* return a pointer to n characters*/
if (allocbuf + ALLOCSIZE - allocp >= n) { /*it fits*/
allocp += n;
return allocp - n; /*old p*/
} else /*not enough room*/
return 0;
}
int main(int argc, char** argv) {
FILE *inp_cal;
inp_cal = fopen("calendar.txt", "r+");
char buf[MAXC];
char *line[1024];
char *p_line;
char *in_val_arr[100];
char *in_key_arr[100];
int count = 0;
char delimiter[] = " ";
if (inp_cal) {
printf("Processing file...\n");
while (fgets(buf, MAXC, inp_cal)) {
p_line = malloc(strlen(buf) + 1); // malloced with size of buffer.
char *in_val;
char *in_key;
strcpy(p_line, buf); //used to create a copy of input buffer
line[count] = p_line;
/* separating the line based on the first space. The words after
* the delimeter will be copied into in_val */
char *copy = strchr(p_line, ' ');
if (copy) {
if ((in_val = alloc(strlen(line[count]) + 1)) == NULL) {
return -1;
} else {
strcpy(in_val, copy + 1);
printf("arr: %s", in_val);
in_val_arr[count] = in_val;
}
} else
printf("Could not find a space\n");
/* We now need to get the first word from the input buffer*/
if ((in_key = alloc(strlen(line[count]) + 1)) == NULL) {
return -1;
}
else {
in_key = strtok(buf, delimiter);
printf("%s\n", in_key);
in_key_arr[count] = in_key; // <-- Printed out well
count++;
}
}
for (int i = 0; i < count; ++i)
printf("key=%s, val = %s", in_key_arr[i], in_val_arr[i]); //<-- in_key_arr[i] contains same values throughout, unlike above
fclose(inp_cal);
}
return 0;
}
while-loop output (correct):
Processing file...
arr: 1200 79 Meeting with President
1082018
arr: 1200 79 Meet with John at cinema
2012018
arr: 1400 30 games with Alpha
2082018
arr: 1200 79 sports
3022018
for-loop output (incorrect):
key=21122019, val = 1200 79 Meeting with President
key=21122019, val = 1200 79 Meet with John
key=21122019, val = 1400 30 games with Alpha
key=21122019, val = 1200 79 sports
Any suggestions on how this can be improved and why this is happening? Thanks
Continuing from the comment: in attempting to use strtok to separate your data into key, val, somenum and the remainder of the line as a string, you are making things harder than they need to be.
If the beginning of each of your lines is always:
key val somenum rest
you can simply use sscanf to parse key, val and somenum into, e.g., three unsigned values and the rest of the line into a string. To help preserve the relationship between each key, val, somenum and string, storing the values from each line in a struct will greatly ease keeping track of everything. You can even allocate for the string to minimize storage to the exact amount required. For example, you could use something like the following:
typedef struct { /* struct to handle values */
unsigned key, val, n;
char *s;
} keyval_t;
Then within main() you could allocate for some initial number of structs, keep an index as a counter, and loop reading each line using a temporary struct and buffer, then allocate for the string (+1 for the nul-terminating character) and copy the values to your struct. When the number of structs filled reaches your allocated amount, simply realloc the number of structs and keep going.
For example, let's say you initially allocate for NSTRUCT structs and read each line into buf, e.g.
...
#define NSTRUCT 8 /* initial struct to allocate */
#define MAXC 1024 /* read buffer size (don't skimp) */
...
size_t ndx = 0, /* used */
max = NSTRUCT; /* allocated */
keyval_t *kv = NULL; /* ptr to struct */
...
/* allocate/validate storage for max struct */
if (!(kv = malloc (max * sizeof *kv))) {
perror ("malloc-kv");
return 1;
}
...
while (fgets (buf, MAXC, fp)) { /* read each line of input */
...
Within your while loop, you simply need to parse the values with sscanf, e.g.
char str[MAXC];
size_t len;
keyval_t tmp = {.key = 0}; /* temporary struct for parsing */
if (sscanf (buf, "%u %u %u %1023[^\n]", &tmp.key, &tmp.val, &tmp.n,
str) != 4) {
fprintf (stderr, "error: invalid format, line '%zu'.\n", ndx);
continue;
}
With the values parsed, you check whether your index has reached the number of structs you have allocated and realloc if required (note the use of a temporary pointer to realloc, so the original block is not lost if realloc fails), e.g.
if (ndx == max) { /* check if realloc needed */
/* always realloc with temporary pointer */
void *kvtmp = realloc (kv, 2 * max * sizeof *kv);
if (!kvtmp) {
perror ("realloc-kv");
break; /* don't exit, kv memory still valid */
}
kv = kvtmp; /* assign new block to pointer */
max *= 2; /* double max allocated */
}
Now with storage for the struct, simply get the length of the string, copy the unsigned values to your struct, allocate length + 1 chars for kv[ndx].s, and copy str to kv[ndx].s, e.g.
len = strlen(str); /* get length of str */
kv[ndx] = tmp; /* assign tmp values to kv[ndx] */
kv[ndx].s = malloc (len + 1); /* allocate block for str */
if (!kv[ndx].s) { /* validate */
perror ("malloc-kv[ndx].s");
break; /* ditto */
}
memcpy (kv[ndx++].s, str, len + 1); /* copy str to kv[ndx].s */
}
(note: if you have strdup, you can replace malloc through memcpy with kv[ndx].s = strdup (str);, but since strdup allocates, don't forget to check kv[ndx].s != NULL before incrementing ndx if you go that route)
That's pretty much the easy and robust way to capture your data. It is now contained in an allocated array of struct which you can use as needed, eg
for (size_t i = 0; i < ndx; i++) {
printf ("kv[%2zu] : %8u %4u %2u %s\n", i,
kv[i].key, kv[i].val, kv[i].n, kv[i].s);
free (kv[i].s); /* free string */
}
free (kv); /* free structs */
(don't forget to free the memory you allocate)
Putting it all together, you could do something like the following:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define NSTRUCT 8 /* initial struct to allocate */
#define MAXC 1024 /* read buffer size (don't skimp) */
typedef struct { /* struct to handle values */
unsigned key, val, n;
char *s;
} keyval_t;
int main (int argc, char **argv) {
char buf[MAXC]; /* line buffer */
size_t ndx = 0, /* used */
max = NSTRUCT; /* allocated */
keyval_t *kv = NULL; /* ptr to struct */
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
perror ("fopen-file");
return 1;
}
/* allocate/validate storage for max struct */
if (!(kv = malloc (max * sizeof *kv))) {
perror ("malloc-kv");
return 1;
}
while (fgets (buf, MAXC, fp)) { /* read each line of input */
char str[MAXC];
size_t len;
keyval_t tmp = {.key = 0}; /* temporary struct for parsing */
if (sscanf (buf, "%u %u %u %1023[^\n]", &tmp.key, &tmp.val, &tmp.n,
str) != 4) {
fprintf (stderr, "error: invalid format, line '%zu'.\n", ndx);
continue;
}
if (ndx == max) { /* check if realloc needed */
/* always realloc with temporary pointer */
void *kvtmp = realloc (kv, 2 * max * sizeof *kv);
if (!kvtmp) {
perror ("realloc-kv");
break; /* don't exit, kv memory still valid */
}
kv = kvtmp; /* assign new block to pointer */
max *= 2; /* double max allocated */
}
len = strlen(str); /* get length of str */
kv[ndx] = tmp; /* assign tmp values to kv[ndx] */
kv[ndx].s = malloc (len + 1); /* allocate block for str */
if (!kv[ndx].s) { /* validate */
perror ("malloc-kv[ndx].s");
break; /* ditto */
}
memcpy (kv[ndx++].s, str, len + 1); /* copy str to kv[ndx].s */
}
if (fp != stdin) /* close file if not stdin */
fclose (fp);
for (size_t i = 0; i < ndx; i++) {
printf ("kv[%2zu] : %8u %4u %2u %s\n", i,
kv[i].key, kv[i].val, kv[i].n, kv[i].s);
free (kv[i].s); /* free string */
}
free (kv); /* free structs */
}
Example Use/Output
Using your data file as input, you would receive the following:
$ ./bin/fgets_sscanf_keyval <dat/keyval.txt
kv[ 0] : 1082018 1200 79 Meeting with President
kv[ 1] : 2012018 1200 79 Meet with John at cinema
kv[ 2] : 2082018 1400 30 games with Alpha
kv[ 3] : 3022018 1200 79 sports
Memory Use/Error Check
In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.
It is imperative that you use a memory error checking program to ensure you do not attempt to access memory or write beyond/outside the bounds of your allocated block, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.
For Linux, valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use; just run your program through it.
$ valgrind ./bin/fgets_sscanf_keyval <dat/keyval.txt
==6703== Memcheck, a memory error detector
==6703== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==6703== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==6703== Command: ./bin/fgets_sscanf_keyval
==6703==
kv[ 0] : 1082018 1200 79 Meeting with President
kv[ 1] : 2012018 1200 79 Meet with John at cinema
kv[ 2] : 2082018 1400 30 games with Alpha
kv[ 3] : 3022018 1200 79 sports
==6703==
==6703== HEAP SUMMARY:
==6703== in use at exit: 0 bytes in 0 blocks
==6703== total heap usage: 5 allocs, 5 frees, 264 bytes allocated
==6703==
==6703== All heap blocks were freed -- no leaks are possible
==6703==
==6703== For counts of detected and suppressed errors, rerun with: -v
==6703== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Always confirm that you have freed all memory you have allocated and that there are no memory errors.
Look things over and let me know if you have any further questions. If you need to further split kv[i].s, then you can think about using strtok.
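You are storing the same pointer in in_key_arr over and over again. strtok returns a pointer into buf, and buf is overwritten by every fgets call, so every element ends up pointing at the same buffer.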
You roughly need this:
in_key = strtok(buf, delimiter);
printf("%s\n", in_key);
char *newkey = malloc(strlen(in_key) + 1); // <<<< allocate new memory
strcpy(newkey, in_key);
in_key_arr[count] = newkey; // <<<< store newkey
count++;
Disclaimer:
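You are assigning an address to in_key with the call to alloc, then reassigning it with the call to strtok, so the same address (inside buf) is stored on every pass. Copy the return from strtok into in_key instead. Debug prints of the addresses show what happens: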
char *copy = strchr(p_line, ' ');
if (copy) {
if ((in_val = alloc(strlen(line[count]) + 1)) == NULL) {
return -1;
} else {
printf("arr: %ul\n", in_val);
strcpy(in_val, copy + 1);
printf("arr: %s", in_val);
in_val_arr[count] = in_val;
}
} else
printf("Could not find a space\n");
/* We now need to get the first word from the input buffer*/
if ((in_key = alloc(strlen(line[count]) + 1)) == NULL) {
return -1;
}
else {
printf("key: %ul\n", in_key);
in_key = strtok(buf, delimiter);
printf("key:\%ul %s\n",in_key, in_key);
in_key_arr[count++] = in_key; // <-- Printed out well
}
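output (the addresses are printed with %ul, i.e. %u followed by a literal l; %p with a (void *) cast is the portable way to print a pointer):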
allocbuf: 1433760064l
Processing file...
all: 1433760064l
arr: 1433760064l
arr: 1200 79 Meeting with President
all: 1433760104l
key: 1433760104l
key:4294956352l 1082018
This change fixed it:
strcpy(in_key, strtok(buf, delimiter));