I am new to C, and I'm trying to find a way to read a CSV
file and output the fifth field of each line until EOF.
My data looks like this:
05/02/2012 00:00:01.548,XOLT,1ZE86V280394811433,trackthepack,23.22.11.82,en_US,
05/02/2012 00:00:01.605,XOLT,1ZVzVrZVhOaGNtUnZi,hadees,50.16.47.103,en_US,VE
05/02/2012 00:00:01.647,XOLT,1ZbWhoY21GMGFHRnVY,hadees,50.19.203.230,en_US,VE
05/02/2012 00:00:02.275,XOLT,1Z4217060300279193,trackthepack,107.21.159.246,en_US,
05/02/2012 00:00:02.599,XOLT,1Z9X98040398954479,Cascademfg,66.117.15.81,en_US,NF
05/02/2012 00:00:02.639,XOLT,1Z3X252W0363295735,trackthepack,107.22.101.79,en_US,
I need to read this file, store the value of the fifth field (e.g. 23.22.11.82), and use it in further processing to look up a match.
In Java, I use the following code to split each CSV line:
String delims = "[,]";
while ((s1 = in.readLine()) != null && s1.length() != 0){
String[] tokens = s1.split(delims);
Is there a similar way in C? I want to do this in C because my code runs faster there.
I tried some C code and was able to read the file (3 records), but it does not seem to detect the end of the line and I am hitting a segmentation fault. I am using fgets and strtok.
The input file has variable-length lines delimited by commas (,), and I want to get the fifth token in each line and then use it as a lookup key.
Here is the code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "GeoIP.h"
#include "GeoIPCity.h"

static const char * _mk_NA( const char * p ){
    return p ? p : "N/A";
}

int
main(int argc, char *argv[])
{
    FILE *f;
    FILE *out_f;
    GeoIP *gi;
    GeoIPRecord *gir;
    int generate = 0;
    char iphost[50];
    char *nextWordPtr = NULL;
    int wordCount = 0;
    char *rechost;
    char recbuffer[1000];
    char delims[] = ",";
    const char *time_zone = NULL;
    char **ret;

    if (argc == 2)
        if (!strcmp(argv[1], "gen"))
            generate = 1;

    gi = GeoIP_open("../data/GeoIPCity.dat", GEOIP_MEMORY_CACHE);
    if (gi == NULL) {
        fprintf(stderr, "Error opening database\n");
        exit(1);
    }
    f = fopen("city_test.txt", "r");
    if (f == NULL) {
        fprintf(stderr, "Error opening city_test.txt\n");
        exit(1);
    }
    out_f = fopen("out_city_lookup_test.txt", "w");
    if (out_f == NULL) {
        fprintf(stderr, "Error opening out_city_lookup_test.txt\n");
        exit(1);
    }

    /* Read the file line by line and get the IP address to use for the GeoIP lookup. */
    /* sizeof keeps fgets inside recbuffer (passing 1001 could overflow the 1000-byte buffer). */
    while (fgets(recbuffer, sizeof recbuffer, f) != NULL) {
        wordCount = 0;  /* must restart for every line, otherwise only the first line is tokenized */
        nextWordPtr = strtok(recbuffer, delims);
        while (nextWordPtr != NULL && wordCount < 5) {  /* logical &&, not bitwise & */
            printf("word%d %s\n", wordCount, nextWordPtr);
            if (wordCount == 4) {
                printf("nextWordPtr %s\n", nextWordPtr);
                /* bounded copy so a long token cannot overflow iphost */
                strncpy(iphost, nextWordPtr, sizeof iphost - 1);
                iphost[sizeof iphost - 1] = '\0';
                printf("iphost %s\n", iphost);
            }
            wordCount++;
            nextWordPtr = strtok(NULL, delims);
        }
        /* Skip lines with fewer than five tokens; iphost would be unset or stale. */
        if (wordCount < 5)
            continue;

        gir = GeoIP_record_by_name(gi, (const char *) iphost);
        if (gir != NULL) {
            ret = GeoIP_range_by_ip(gi, (const char *) iphost);
            time_zone = GeoIP_time_zone_by_country_and_region(gir->country_code, gir->region);
            printf("%s\t%s\t%s\t%s\t%s\t%s\t%f\t%f\t%d\t%d\t%s\t%s\t%s\n", iphost,
                   _mk_NA(gir->country_code),
                   _mk_NA(gir->region),
                   _mk_NA(GeoIP_region_name_by_code(gir->country_code, gir->region)),
                   _mk_NA(gir->city),
                   _mk_NA(gir->postal_code),
                   gir->latitude,
                   gir->longitude,
                   gir->metro_code,
                   gir->area_code,
                   _mk_NA(time_zone),
                   ret[0],
                   ret[1]);
            fprintf(out_f, "%s\t%s\t%s\t%s\t%s\t%s\t%f\t%f\t%d\t%d\t%s\t%s\t%s\n", iphost,
                    _mk_NA(gir->country_code),
                    _mk_NA(gir->region),
                    _mk_NA(GeoIP_region_name_by_code(gir->country_code, gir->region)),
                    _mk_NA(gir->city),
                    _mk_NA(gir->postal_code),
                    gir->latitude,
                    gir->longitude,
                    gir->metro_code,
                    gir->area_code,
                    _mk_NA(time_zone),
                    ret[0],
                    ret[1]);
            GeoIP_range_by_ip_delete(ret);
            GeoIPRecord_delete(gir);
        }
    }
    GeoIP_delete(gi);
    fclose(f);
    fclose(out_f);
    return 0;
}
Yes, it is not that elegant, but you can get the job done with strtok.
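A minimal sketch of the strtok approach, assuming each line fits in a 1000-byte buffer and no field is empty. The helper name nth_token is hypothetical (it is not part of any library); it works on a copy because strtok overwrites the delimiters in the string it scans, and it starts its counter at zero on every call so one line cannot affect the next.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the n-th (0-based) comma-separated token of
 * `line` into `out`. Returns 1 on success, 0 if the line is too long,
 * has too few tokens, or the token does not fit in `out`. */
int nth_token(const char *line, int n, char *out, size_t outsize)
{
    char copy[1000];   /* assumed maximum line length */
    char *tok;
    int count = 0;     /* starts at 0 for every line */

    assert(line != NULL && out != NULL && outsize > 0);

    if (strlen(line) >= sizeof copy)
        return 0;
    strcpy(copy, line);   /* strtok modifies its input, so tokenize a copy */

    /* "\r\n" in the delimiter set strips the line ending that fgets keeps */
    for (tok = strtok(copy, ",\r\n"); tok != NULL; tok = strtok(NULL, ",\r\n")) {
        if (count == n) {
            if (strlen(tok) >= outsize)
                return 0;
            strcpy(out, tok);
            return 1;
        }
        count++;
    }
    return 0;   /* fewer than n + 1 tokens on this line */
}
```

Called as nth_token(recbuffer, 4, iphost, sizeof iphost) inside an fgets loop, this would yield the fifth field of each line, and a return value of 0 signals a line that should be skipped rather than looked up.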
For what you want, a better approach is a lexer. If your end goal is complex, you might want a parser as well.
I've got an example lexer and parser here. It is more complex than what you need, though. If you want something simple, strtok will do the job, but there are several nasty surprises to watch out for, and it will be difficult to use beyond the simple case you have presented here.
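One such surprise is that strtok treats a run of consecutive delimiters as a single delimiter, so an empty field is silently skipped and every later field shifts by one position. For a line such as "a,,c,d,1.2.3.4,en_US," the fifth strtok token is en_US, not the IP address. The sketch below is one possible alternative, not a full CSV parser: it walks the line with strchr and strcspn, keeps empty fields, and leaves the input unmodified. The name csv_field is hypothetical, and quoted fields containing commas are assumed not to occur.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy field number `n` (0-based) of a comma-separated
 * line into `out`, counting empty fields. Returns 1 on success, 0 if the
 * field does not exist or does not fit in `out`. */
int csv_field(const char *line, int n, char *out, size_t outsize)
{
    const char *start = line;
    size_t len;

    assert(line != NULL && out != NULL && outsize > 0);

    /* Step over n commas to reach the start of the requested field. */
    while (n > 0) {
        start = strchr(start, ',');
        if (start == NULL)
            return 0;      /* the line has fewer fields than requested */
        start++;           /* move past the comma */
        n--;
    }

    /* The field ends at the next comma, the line ending, or the end of the string. */
    len = strcspn(start, ",\r\n");
    if (len >= outsize)
        return 0;
    memcpy(out, start, len);
    out[len] = '\0';
    return 1;
}
```

Because the function keeps no hidden state, it can be called for several fields of the same line, or from nested loops, which is not possible with strtok.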