
Tokenizing buffer segmentation fault

I'm guessing I'm missing something fairly simple here. I'm trying to read a file line by line, tokenizing the buffer as I go; I've pasted the basics of what I'm trying to do below. I've never had an issue with strtok before, so I'm guessing it has to do with the buffer I'm using. Any nudges in the right direction? I've read that strtok isn't a great option, but it's the only thing I'm familiar with (I suppose I could write my own function). It reads the first token as it's supposed to every time; it doesn't segfault until I try to find the second token with strtok(NULL, " ").

I don't know why this was downvoted as a duplicate. Yes, there are answers out there that cover the basics of what I'm trying to do, but I want to understand the problem, not just cut and paste. I'd prefer to know WHY there is a segfault and why my code behaves the way it does. There's no need to downvote when I'm asking specific questions that aren't addressed directly in other posts.

#include <stdio.h>
#include <string.h>

const char *file = "path/to/file/file.txt";
void tokenize();

//Eventually file will be command line opt
FILE *open_file(const char *file);

int main(int argc, char *argv[])
{
    tokenize();
}

void tokenize()
{
    FILE *fp;
    fp = open_file(file);
    char buffer[BUFSIZ];

    while(fgets(buffer,BUFSIZ,fp) != NULL)
    {
        //puts("========================================");
        //puts(buffer);
        //puts("========================================");

        char *data = strdup(buffer);
        char *token;
        token = strtok(data, " ");
        //puts(token);
        while(token != NULL)
        {
            token = strtok(NULL, " ");

            puts("++++++++++++++++++++++++++++++++++++++++++++++");
            puts(token); //crashes here: token is NULL on the final iteration
            puts("++++++++++++++++++++++++++++++++++++++++++++++");
        }
    }
    fclose(fp);
}

FILE *open_file(const char *file)
{
    FILE *fp;
    fp = fopen(file, "r");

    if(fp == NULL)
    {
        perror("Error opening file");
    }
    return fp;
}        

Your while loop checks that token is not NULL, but then overwrites it on the first line of the loop body before using it. Once strtok() runs out of tokens and returns NULL, the puts(token) that follows is handed a null pointer, and that is what crashes. The second call to strtok() should be at the end of the loop:

    while(token != NULL)
    {
        puts("++++++++++++++++++++++++++++++++++++++++++++++");
        puts(token);
        puts("++++++++++++++++++++++++++++++++++++++++++++++");

        token = strtok(NULL, " ");
    }

Also, don't forget to free(data) at the bottom of your outer while loop. Otherwise, you have a memory leak.
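For what it's worth, the same logic is often written as a single for loop, which makes the fetch-the-next-token-at-the-end ordering hard to get wrong. A minimal sketch, assuming data and the surrounding fgets loop stay as posted:

    for (char *token = strtok(data, " "); token != NULL; token = strtok(NULL, " "))
    {
        puts(token);   // token is guaranteed non-NULL inside the body
    }
    free(data);        // release the strdup'd copy before reading the next line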

You also have a memory leak with

char *data = strdup(buffer);

strdup uses malloc to allocate memory for the dup string, which it is your responsibility to free . But, you don't, and in every loop you overwrite the previous allocated pointer with another one, resulting in memory leak .

Not really an answer: I just re-edited an incorrect answer, so downvote as you like.

Thanks, everyone! Here is my solution:

while(fgets(buffer,BUFSIZ,fp) != NULL)
{
    char *token;
    token = strtok(buffer, " ");
    while(token != NULL)
    {
        token = strtok(NULL, " ");
        if(token != NULL)
        {
            printf("%s\n", token);
        }
    }

}

fclose(fp);

As the other answers indicated, the problem wasn't tokenizing with a NULL first argument (that's just how you ask strtok() for the next token); it was trying to print a NULL pointer, since passing NULL to puts() is undefined behavior. All was well in the world after I added the if(token != NULL) check inside the while loop.
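Since you mentioned that strtok() has a poor reputation: the usual complaint is its hidden static state, and the POSIX replacement strtok_r() fixes that by keeping the state in a caller-supplied pointer (it needs <string.h> on a POSIX system). A minimal sketch over the same buffer; the loop body is illustrative, and note that unlike the version above this also prints the first token instead of skipping it:

    char *saveptr;
    for (char *token = strtok_r(buffer, " ", &saveptr);
         token != NULL;
         token = strtok_r(NULL, " ", &saveptr))
    {
        printf("%s\n", token);
    }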
