
strtok() with realloc() weird behaviour

I have the following program written in C:

    ...
    char *answer = NULL;
    char *pch = strtok(phrase, " "); // phrase is a string with possibly many words
    while (pch) {
        char *tmp = translate_word(pch); // returns a string based on pch
        void *ptr = realloc(answer, sizeof(answer) + sizeof(tmp) + 1000); // allocate space to answer
        if (!ptr) // If realloc fails
             return -1;
        strcat(answer, tmp); // append tmp to answer
        pch = strtok(NULL, " "); // find next word
    }
    ...

The problem is that strtok() behaves strangely: it returns a word that does not exist in the phrase string but is part of the answer string.

On the other hand, when I change the following line:

    void *ptr = realloc(answer, sizeof(answer) + sizeof(tmp) + 1000);

to:

    void *ptr = realloc(answer, sizeof(answer) + sizeof(tmp) + 1);

strtok() works as expected.

How is it possible that realloc() affects strtok() in this case? They do not even use the same variables. Looking forward to your insights.

The realloc function may move the memory that was previously allocated. It returns a pointer to the (possibly relocated) block, and if that pointer differs from the one you passed in, the old pointer is no longer valid. Your code never assigns the result back to answer, so when you call strcat(answer, tmp) you're potentially writing to freed memory, which invokes undefined behavior. In this case it manifests as the strange output you're seeing.

After checking the return value of realloc, assign that value back to answer.

Also, sizeof(answer) and sizeof(tmp) give you the size of the pointer, not the size of what it points to. You instead want to use strlen to get the length of the strings they contain. And while we're at it, let's just add 1 to this instead of 1000, because that's all you actually need (one byte for the null terminator).

    void *ptr = realloc(answer, strlen(answer) + strlen(tmp) + 1);
    if (!ptr)
         return -1;
    answer = ptr;
    strcat(answer, tmp);
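To make the sizeof versus strlen difference concrete, here is a minimal sketch. The helper names concat_size and pointer_size_sum are hypothetical and only serve the comparison; they are not part of the original program.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: bytes needed to hold `a` followed by `b`,
   plus one byte for the terminating '\0'. */
size_t concat_size(const char *a, const char *b)
{
    return strlen(a) + strlen(b) + 1;
}

/* For contrast: what the question's expression computes (with + 1).
   sizeof applied to a pointer yields the size of the pointer itself,
   no matter how long the string it points to is. */
size_t pointer_size_sum(const char *a, const char *b)
{
    return sizeof(a) + sizeof(b) + 1;
}
```

The second function returns the same value for every pair of strings, which is why the buffer in the question never grows with the text being appended.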

One more issue: answer starts out as NULL, so the first realloc call behaves like malloc and the memory it returns is completely uninitialized. Calling strcat on it depends on answer containing a null-terminated string. It doesn't, so this also invokes undefined behavior (as does calling strlen(answer) while answer is still NULL).

This can be fixed by malloc-ing a single byte to start and setting it to 0, so that you start with an empty string.

    char *answer = malloc(1);
    if (!answer) return -1;
    answer[0] = 0;
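Putting these fixes together, the loop might look like the sketch below. The question does not show translate_word, so the version here is a hypothetical stand-in that returns the word unchanged. The wrapper translate_phrase and its NULL-on-failure convention are also assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the question's translate_word():
   it simply returns the word unchanged. */
static char *translate_word(char *word)
{
    return word;
}

/* Concatenates the translation of every word in `phrase`.
   Note that strtok() modifies `phrase`. Returns a malloc'd string
   that the caller must free, or NULL on allocation failure. */
char *translate_phrase(char *phrase)
{
    char *answer = malloc(1);          /* start with an empty string */
    if (!answer)
        return NULL;
    answer[0] = '\0';

    for (char *pch = strtok(phrase, " "); pch; pch = strtok(NULL, " ")) {
        char *tmp = translate_word(pch);
        char *ptr = realloc(answer, strlen(answer) + strlen(tmp) + 1);
        if (!ptr) {
            free(answer);              /* on failure the old block is still valid */
            return NULL;
        }
        answer = ptr;                  /* the block may have moved */
        strcat(answer, tmp);
    }
    return answer;
}
```

Freeing answer when realloc fails avoids leaking the old block, which realloc leaves untouched in that case.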

sizeof(answer) and sizeof(tmp) give you the sizes of the pointers, not of the strings.

You need to use strlen instead.

Additionally, with

    char *answer = NULL;

either of these:

    ... strlen(answer) ...
    strcat(answer, tmp);

dereferences a null pointer. These usually fail with a segmentation fault, but may not, depending on the OS, because the behavior is undefined. Dereferencing NULL is never a good idea.

In short, you need to either know that you have assigned something to answer, or check whether answer is NULL.
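One way to do the NULL check is to wrap the append step in a small helper that treats a NULL destination as an empty string. This is only a sketch: the name append_str and its return convention are hypothetical, not something from the question.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: appends `src` to the heap string `*dst`,
   treating a NULL `*dst` as an empty string.
   Returns 0 on success, -1 if realloc fails (`*dst` is left unchanged). */
int append_str(char **dst, const char *src)
{
    size_t old_len = *dst ? strlen(*dst) : 0;   /* never call strlen(NULL) */
    size_t add_len = strlen(src);

    /* realloc(NULL, n) behaves like malloc(n), so the first call allocates. */
    char *ptr = realloc(*dst, old_len + add_len + 1);
    if (!ptr)
        return -1;

    memcpy(ptr + old_len, src, add_len + 1);    /* also copies the '\0' */
    *dst = ptr;
    return 0;
}
```

With this helper, answer can stay initialized to NULL, and the loop body reduces to a call such as append_str(&answer, tmp) followed by a check of the return value.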
