
C - Segmentation fault using strtok

I have this code, which reads multiple files and prints certain values from each one. After several files have been read, my while loop stops and the program reports a segmentation fault.

Here is my code:

int main () {

    const char s[2] = ",";
    const char s2[2] = ":";

    char var1[] = "fiftyTwoWeekHigh\"";
    char *fiftyhigh;
    char *fiftyhigh2;
    char *fiftyhigh_token;
    char *fiftyhigh2_token;
   
    char var2[] = "fiftyTwoWeekLow\"";
    char *fiftylow;
    char *fiftylow2;
    char *fiftylow_token;
    char *fiftylow2_token;

    char var3[] = "regularMarketPrice\"";
    char *price;
    char *price2;
    char *price_token;
    char *price2_token;
   
    FILE *fp;
    char* data = "./data/";
    char* json = ".json";
    char line[MAX_LINES];
    char line2[MAX_LINES];
    int len;
    char* fichier = "./data/indices.txt";

    fp = fopen(fichier, "r");

    if (fp == NULL){
        printf("Impossible d'ouvrir le fichier %s", fichier);
        return 1;
    }

    while (fgets(line, sizeof(line), fp) != NULL) {
        char fname[10000];
        len = strlen(line);
        if (line[len-1] == '\n') {
            line[len-1] = 0;
        }
        
        int ret = snprintf(fname, sizeof(fname), "%s%s%s", data, line, json);
        if (ret < 0) {
            abort();
        }
        printf("%s\n", fname);
        
        FILE* f = fopen(fname, "r");

        while ( fgets( line2, MAX_LINES, f ) != NULL ) {
            fiftyhigh = strstr(line2, var1);
            fiftyhigh_token = strtok(fiftyhigh, s);
            fiftyhigh2 = strstr(fiftyhigh_token, s2);
            fiftyhigh2_token = strtok(fiftyhigh2, s2);
            printf("%s\n", fiftyhigh2_token);

            fiftylow = strstr(line2, var2);
            fiftylow_token = strtok(fiftylow, s);
            fiftylow2 = strstr(fiftylow_token, s2);
            fiftylow2_token = strtok(fiftylow2, s2);
            printf("%s\n", fiftylow2_token);

            price = strstr(line2, var3);
            price_token = strtok(price, s);
            price2 = strstr(price_token, s2);
            price2_token = strtok(price2, s2);
            printf("%s\n", price2_token);
        
            //printf("\n%s\t%s\t%s\t%s\t%s", line, calculcx(fiftyhigh2_token, price2_token, fiftylow2_token), "DIV-1", price2_token, "test");
            
        }
        fclose(f);
    }
    fclose(fp);
    return 0;
}

and the output is:

./data/k.json
13.59
5.31
8.7
./data/BCE.json
60.14
46.03
56.74
./data/BNS.json
80.16
46.38
78.73
./data/BLU.json
16.68
2.7
Segmentation fault

It looks as if my program stops because it can't find a certain piece of data in a certain file. Is there a way to allocate more memory? My MAX_LINES is already set to 6000.

  • Did you mean '\0'?
if (line[len-1] == '\n') {
  line[len-1] = 0;
}

I advise you to use gdb to see where the segfault occurs and why. I don't think you need to allocate more memory. The segfault may happen because there is no more data to find and you still print the result.

Use if (price2_token != NULL) printf("%s\n", price2_token); for example.
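To extend that check to every step rather than only the final printf, here is a minimal sketch. The helper name extract_value is hypothetical (it is not in the question); it runs the same strstr/strtok sequence as the question's code and gives up as soon as any call returns NULL.

```c
#include <string.h>

/* Hypothetical helper: same strstr/strtok sequence as the question,
 * but returns NULL as soon as any step fails instead of passing a
 * NULL pointer on to the next call.
 * Like the original code, it modifies `line` in place. */
char *extract_value(char *line, const char *key)
{
    char *field = strstr(line, key);          /* locate the field name */
    if (field == NULL)
        return NULL;

    char *field_token = strtok(field, ",");   /* cut at the next comma */
    if (field_token == NULL)
        return NULL;

    char *colon = strstr(field_token, ":");   /* locate the separator */
    if (colon == NULL)
        return NULL;

    return strtok(colon, ":");                /* may still be NULL */
}
```

The caller still has to test the result before printing it. This only turns the crash into a detectable "not found"; it does not explain why the field could not be found.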

I'm assuming that the lines in your file look something like this:

{"fiftyTwoWeekLow":32,"fiftyTwoWeekHigh":100, ... }

In other words it's some kind of JSON format. I'm assuming that the line starts with '{' so each line is a JSON object.

You read that line into line2, which now contains:

{"fiftyTwoWeekLow":32,"fiftyTwoWeekHigh":100, ... }\0

Note the \0 at the end that terminates the string. Note also that "fiftyTwoWeekLow" comes first, which turns out to be really important.

Now let's trace through the code here:

fiftyhigh = strstr(line2, var1);
fiftyhigh_token = strtok(fiftyhigh, s);

First you call strstr to find the position of "fiftyTwoWeekHigh". This will return a pointer to the position of that field name in the line. Then you call strtok to find the comma that separates this value from the next. I think that this is where things start to go wrong. After the call to strtok, line2 looks like this:

{"fiftyTwoWeekLow":32,"fiftyTwoWeekHigh":100\0 ... }\0

Note that strtok has modified the string: the comma has been replaced with \0. That's so you can use the returned pointer fiftyhigh_token as a string without seeing all the stuff that came after the comma.
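You can observe that modification directly. In this small sketch the helper name cut_field is hypothetical; it performs just the two calls traced above.

```c
#include <string.h>

/* Hypothetical helper mirroring the first two calls in the question.
 * Returns the token starting at `key`, or NULL if `key` is absent. */
char *cut_field(char *line, const char *key)
{
    char *field = strstr(line, key);
    if (field == NULL)
        return NULL;
    return strtok(field, ",");   /* overwrites the next ',' with '\0' */
}
```

If you compare strlen(line2) before and after calling cut_field(line2, "fiftyTwoWeekHigh\""), the second value is smaller whenever another field follows: as far as any string function is concerned, the line now ends right after the "fiftyTwoWeekHigh" value.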

fiftyhigh2 = strstr(fiftyhigh_token, s2);
fiftyhigh2_token = strtok(fiftyhigh2, s2);
printf("%s\n", fiftyhigh2_token);

Next you look for the colon and then call strtok with a pointer to the colon. Since the delimiter you're passing to strtok is the colon, strtok skips it and returns the next token. The string we're looking at now ends right after "100" and contains no more colons, so that token is the rest of the string: in other words, the number.

So you've gotten your number, but probably not in the way you expected? There was really no point in the second call to strtok since (assuming the JSON was well-formed) the position of "100" was just fiftyhigh2+1.
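Here is a sketch of that simpler route, with no second strtok. Both helper names are hypothetical, and the strtod conversion is an optional extra that goes beyond what the question's code does (it only prints the text).

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: given a token such as  fiftyTwoWeekHigh":100
 * return a pointer to the text after the first ':',
 * or NULL if the token contains no ':'. */
const char *value_after_colon(const char *token)
{
    const char *colon = strchr(token, ':');
    return colon != NULL ? colon + 1 : NULL;
}

/* Hypothetical helper: convert that text to a double.
 * Returns 1 and stores the result in *out on success, 0 otherwise. */
int parse_value(const char *token, double *out)
{
    const char *text = value_after_colon(token);
    if (text == NULL)
        return 0;

    char *end;
    double v = strtod(text, &end);
    if (end == text)             /* no characters were converted */
        return 0;

    *out = v;
    return 1;
}
```

Neither function writes to the token, so the strtok state and the rest of the line are left alone.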

Now we try to find "fiftyTwoWeekLow":

fiftylow = strstr(line2, var2);
fiftylow_token = strtok(fiftylow, s);
fiftylow2 = strstr(fiftylow_token, s2);
fiftylow2_token = strtok(fiftylow2, s2);
printf("%s\n", fiftylow2_token);

This is basically the same process, and after you call strtok, line2 looks like this:

{"fiftyTwoWeekLow":32\0"fiftyTwoWeekHigh":100\0 ... }\0

Note that you're only able to find "fiftyTwoWeekLow" because it comes before "fiftyTwoWeekHigh" in the line. If it had come after, you'd have been unable to find it because of the \0 added after "fiftyTwoWeekHigh" earlier. In that case strstr would have returned NULL. Passing NULL to strtok tells it to continue with the previous string, which is already used up, so strtok would return NULL too, and then you'd definitely have gotten a seg fault after passing that NULL to strstr.
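The order dependence is easy to reproduce. In this sketch the helper name still_visible is hypothetical, and any line you feed it (including a "regularMarketPrice" value) is made-up sample data rather than the contents of the real files.

```c
#include <string.h>

/* Hypothetical helper: cut the line at the comma after `first_key`
 * (as the question's code does), then report whether `second_key`
 * can still be found when searching from the start of the line. */
int still_visible(char *line, const char *first_key, const char *second_key)
{
    char *field = strstr(line, first_key);
    if (field != NULL)
        strtok(field, ",");                 /* truncates `line` here */
    return strstr(line, second_key) != NULL;
}
```

For a line whose fields appear in the order Low, High, Price, cutting at "fiftyTwoWeekHigh" leaves "fiftyTwoWeekLow" findable (the function returns 1) but hides "regularMarketPrice" (it returns 0), because that field now sits behind the \0.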

So the code is really sensitive to the order in which the fields appear in the line, and it's probably failing because some of your lines have the fields in a different order. Or maybe some fields are just missing from some lines, which would have the same effect.

If you're parsing JSON, you should really use a library designed for that purpose. But if you really want to use strtok then you should:

  1. Read line2.
  2. Call strtok(line2, ",") once, then repeatedly call strtok(NULL, ",") in a loop until it returns NULL. This will break up the line into tokens that each look like "someField":100 .
  3. Isolate the field name and value from each of these tokens (just call strchr(token, ':') to find the value). Do not call strtok here, because it will change the internal state of strtok and you won't be able to use strtok(NULL, ",") to continue processing the line.
  4. Test the field name, and depending on its value, set an appropriate variable. In other words, if it's the "fiftyTwoWeekLow" field, set a variable called fiftyTwoWeekLow. You don't have to bother to strip off the quotes, just include them in the string you're comparing with.
  5. Once you've processed all the tokens (strtok returns NULL), do something with the variables you set.
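The steps above could be sketched roughly as follows. The struct quote type and the parse_line name are hypothetical, and converting the values with strtod is my own choice; the sketch assumes each line holds one flat JSON object with numeric values only.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the three values of interest. */
struct quote {
    double high, low, price;
    int have_high, have_low, have_price;   /* 1 if the field was seen */
};

/* Splits `line` on commas (step 2), finds ':' in each token with
 * strchr (step 3), and stores recognised fields (step 4).
 * Modifies `line`. Returns the number of recognised fields. */
int parse_line(char *line, struct quote *q)
{
    int found = 0;
    *q = (struct quote){0};

    for (char *tok = strtok(line, ","); tok != NULL; tok = strtok(NULL, ",")) {
        char *colon = strchr(tok, ':');     /* no strtok here: it would
                                               reset the comma loop */
        if (colon == NULL)
            continue;                       /* not a name:value pair */
        double value = strtod(colon + 1, NULL);

        /* strstr rather than strcmp, so that a leading '{' on the
         * first token does not prevent a match. Quotes are kept in
         * the pattern, as suggested in step 4. */
        if (strstr(tok, "\"fiftyTwoWeekHigh\":") != NULL) {
            q->high = value;  q->have_high = 1;  found++;
        } else if (strstr(tok, "\"fiftyTwoWeekLow\":") != NULL) {
            q->low = value;   q->have_low = 1;   found++;
        } else if (strstr(tok, "\"regularMarketPrice\":") != NULL) {
            q->price = value; q->have_price = 1; found++;
        }
    }
    return found;
}
```

Step 5 is then a matter of checking the have_* flags after parse_line returns and printing only the values that were actually present, whatever order they appeared in.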

You may be able to pass ",{}" as the delimiter to strtok in order to get rid of any open and close curly braces that surround the line. Or you could look for them in each token and ignore them if they appear.

You could also pass "\"{},:" as the delimiter to strtok. This would cause strtok to emit an alternating sequence of field names and values. You could call strtok once to get the field name, again to get the value, then test the field name and do something with the value.
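A minimal sketch of that alternating name/value loop, under the same assumption of flat objects with numeric values. The function name find_number is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: walk the name/value pairs produced by strtok
 * with the delimiter set "\"{},:" and fetch the number stored under
 * `name` (given without quotes, since the quotes are delimiters).
 * Modifies `line`. Returns 1 if found, 0 otherwise. */
int find_number(char *line, const char *name, double *out)
{
    static const char delims[] = "\"{},:";

    char *field = strtok(line, delims);       /* first field name */
    while (field != NULL) {
        char *value = strtok(NULL, delims);   /* token after the name */
        if (value == NULL)
            break;                            /* name without a value */
        if (strcmp(field, name) == 0) {
            *out = strtod(value, NULL);
            return 1;
        }
        field = strtok(NULL, delims);         /* next field name */
    }
    return 0;
}
```

Because the line is chopped up during the search, each buffer is good for one lookup only. To collect several fields from the same line, test all the names of interest inside the one loop instead of returning at the first match.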

Using strtok is a pretty primitive way of parsing JSON, but it will work as long as your JSON only contains simple field names and numbers and doesn't include any strings that themselves contain delimiter characters.


 