
Why does freeing a string in a trie result in a malloc error?

I need help understanding a debug message that I got from malloc. I wrote a function that deletes a specified string from a trie by recursively traversing that string on the trie and freeing everything that isn't used by another string. This involves going down to the very last node and freeing it, then going back up the stack and checking each level to see whether that level is also unused by other strings, and if so, freeing it. Once it gets to the first node that is used by something else, it stops.

The function seems to work fine when I only remove one string, but when I remove a second string, I start having problems. Here's the entire code of my program so far (note that some of the lines are there for testing/debugging purposes and are not integral to the program):

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>

struct node {
    char chr;
    bool end;
    struct node *children[128];
};

void add( struct node *, char * );
void del( struct node *, char * );
bool isMember( struct node *, char * );

bool recursiveDel( struct node *, char * );
// del is really just a dummy function that calls recursiveDel.

int main( int argc, char **argv ){
    struct node *trie = (struct node *) malloc( sizeof( struct node ) );
    for( int i = 1; i < argc; i++ ){
        add( trie, argv[i] );
    }
    del( trie, argv[1] );
    del( trie, argv[2] );
    for( int i = 1; i < argc; i++ ){
        printf( "%d\n", isMember( trie, argv[i] ) );
    }
    return 0;
}

void add( struct node *trie, char *str ){
    int i = 0;
    while( str[i] ){
        // Check/goto next node
        // If NULL, create next node
        if( trie->children[str[i]] == NULL )
            trie->children[str[i]] = (struct node *) malloc( sizeof( struct node ) );
        trie = trie->children[str[i++]];
    }
    trie->end = true;
}

void del( struct node *trie, char *str ){
    if( isMember( trie, str ) ){
        recursiveDel( trie, str );
    }
}

bool isMember( struct node *trie, char *str ){
    int i = 0;
    struct node *cur = trie;
    while( str[i] ){
        if( trie->children[str[i]] == NULL ) return false;
        else trie = trie->children[str[i++]];
    }
    return trie->end;
}

// Features of this function:
// When it gets to the leaf, it deletes that node and then starts going back up the call stack
// Each call passes a Boolean value back up the call stack.
// This boolean value indicates whether or not the node was deleted.
// If the value returned from the lower node is true, then that means check the next node up to see if it should be deleted.
// If false do nothing, because there are other strings using this node.
bool recursiveDel( struct node *trie, char *str ){
    printf( "%p, %d, %s\n", trie, trie->end, str );
    if( trie->end ){
        free( trie );
        return true;
    }
    bool deleted = recursiveDel( trie->children[str[0]], str+1 );
    if( deleted ){
        int used = 0;
        // Loop checks to see if the node
        // is used by any other strings.
        for( int i = 0; i < 128; i++ ){
            if( trie->children[i] ){
                used++;
                break;
            }
        }
        if( used <= 1 ){
            free( trie );
            return true;
        }
    }
    return false;
}

The problem appears to occur in this block, where I attempted to free the terminating node for the string:

    if( trie->end ){
        free( trie );
        return true;
    }

I get a message saying that the node can't be freed because it doesn't exist...

bash-3.2$ ./trie hello world
0x7fd370802000, 0, hello
0x7fd370800600, 0, ello
0x7fd370802600, 0, llo
0x7fd370802c00, 0, lo
0x7fd370803200, 0, o
0x7fd370803800, 1,
0x7fd370802000, 0, world
0x7fd370803e00, 0, orld
0x7fd370804400, 0, rld
0x7fd370804a00, 0, ld
0x7fd370805000, 0, d
0x7fd370805600, 1,
trie(43216,0x7fff7818e000) malloc: *** error for object 0x7fd370802000: pointer being freed was not allocated
*** set a breakpoint in malloc_error_break to debug
Abort trap: 6
bash-3.2$

It appears to be the case that when I try to remove the second string, it goes to the last node as expected, but then rather than going back up the stack, it keeps going and tries to free the next node after that, which obviously it can't do because this is a leaf node.

It seems like it might be undefined behavior, but at the same time, the program fails in a very predictable way - the first string removal is invariably successful while the second string removal is invariably unsuccessful. I can't make heads or tails of this.

Also, the address given by malloc when it failed seems a little off. The last few addresses all differ by 0x600, but this address differs from the last one by 0xa00. I understand that heap memory allocation is unpredictable, but I just thought I'd point that out.

More strange than that is the fact that the address given by malloc is different from the last printed address, despite the fact that the failed free operation immediately follows the last printf. This almost seems to indicate that the compiler is inserting a pointer-advancing operation between the printf line and the if( trie->end ) free( trie ) part. Common sense indicates that this is ridiculous, but I don't know of any other explanation.

Relevant portion of your code:

int main( int argc, char **argv ){
    struct node *trie = (struct node *) malloc( sizeof( struct node ) );
    for( int i = 1; i < argc; i++ ){
        add( trie, argv[i] );
    }
    …
}

void add( struct node *trie, char *str ){
    int i = 0;
    while( str[i] ){
        if( trie->children[str[i]] == NULL )
//          ^^^^^^^^^^^^^^^^^^^^^^

You're allocating a struct node, then reading its .children[...] pointers without initializing them. That is undefined behavior.

You need to initialize your nodes after allocating them.
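As a minimal sketch of what that could look like, the allocation can be wrapped in a helper that sets every field explicitly. The name new_node is hypothetical and does not appear in the original program; the struct is copied from the question.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct node {
    char chr;
    bool end;
    struct node *children[128];
};

/* Hypothetical helper (not in the original program): allocate a node and
 * set every field explicitly, so no pointer is ever read uninitialized. */
struct node *new_node(char chr) {
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;            /* caller must handle allocation failure */
    n->chr = chr;
    n->end = false;
    for (size_t i = 0; i < sizeof n->children / sizeof n->children[0]; i++)
        n->children[i] = NULL;  /* assigns a real null pointer to each slot */
    return n;
}
```

Both malloc calls in the question (the one in main and the one in add) could then go through this helper, assuming the caller checks the result for NULL.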

Your add function is incorrect here:

trie->children[str[i]] = (struct node *) malloc( sizeof( struct node ) );

You use malloc(), which does not initialize the memory block it returns. The same problem occurs in main(). Indeed, it would make sense to have 2 different structure types for the trie and its nodes.

You should use calloc() or memset() to initialize the contents to all bits zero, which on most current architectures is enough to initialize all the children pointers to NULL.
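The two options might look like this, as a sketch only. The helper names alloc_node and alloc_node_memset are hypothetical, and both rely on the assumption stated above that all-bits-zero is a null pointer on the target platform.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

struct node {
    char chr;
    bool end;
    struct node *children[128];
};

/* Hypothetical helper: calloc returns zero-filled memory, so end is false
 * and (on platforms where all-bits-zero is NULL) every child is NULL. */
struct node *alloc_node(void) {
    return calloc(1, sizeof(struct node));
}

/* Hypothetical equivalent using malloc followed by memset. */
struct node *alloc_node_memset(void) {
    struct node *n = malloc(sizeof *n);
    if (n != NULL)
        memset(n, 0, sizeof *n);
    return n;
}
```

Either helper returns NULL on allocation failure, so the caller still needs to check the result before using it.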

Note also that you have undefined behavior if any of the characters are outside the range 0 .. 127. You should give the children array 256 entries and cast the char to unsigned char before using it as an index.
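To illustrate why the cast matters: whether plain char is signed is implementation-defined, so a byte such as 0xE9 can become a negative index. A small sketch follows, with a hypothetical helper name child_index and the assumption that char is 8 bits wide.

```c
#include <stddef.h>

/* Hypothetical helper: map a character to a slot in a 256-entry children
 * array. Converting through unsigned char yields a value in 0..255
 * (assuming 8-bit char), even on platforms where plain char is signed. */
size_t child_index(char c) {
    return (size_t)(unsigned char)c;
}
```

With this, trie->children[child_index(str[i])] stays within bounds for every possible byte of the input string.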

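Another problem: in recursiveDel you should just clear trie->end and only free the node if all of its children pointers are NULL as well. The same applies to intermediate nodes: you must also check that trie->end is false before freeing the node. The original also never resets the parent's children pointer after calling free(), so the loop still finds the dangling pointer, every ancestor up to the root gets freed by the first deletion, and the second deletion frees the root again. That is the address malloc reports (0x7fd370802000, the first address printed).

Looking more closely at the recursiveDel function, it is broken in multiple ways. Here is a corrected version: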
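The condition for freeing a node can be written as a small predicate. This is a sketch only: the name node_is_unused is hypothetical, and the struct assumes the 256-entry children array suggested above.

```c
#include <stdbool.h>
#include <stddef.h>

struct node {
    char chr;
    bool end;
    struct node *children[256];
};

/* Hypothetical helper: a node may be freed only if no string ends at it
 * and it has no remaining children. */
bool node_is_unused(const struct node *n) {
    if (n->end)
        return false;           /* some string still terminates here */
    for (size_t i = 0; i < sizeof n->children / sizeof n->children[0]; i++) {
        if (n->children[i] != NULL)
            return false;       /* some longer string still passes through */
    }
    return true;
}
```

For the predicate to give the right answer, whoever frees a child must also set the corresponding children slot in the parent back to NULL.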

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>

struct node {
    char chr;
    bool end;
    struct node *children[256];
};

void add(struct node *trie, const char *str);
void del(struct node *trie, const char *str);
bool isMember(struct node *trie, const char *str);

bool recursiveDel(struct node *trie, const char *str);
// del is really just a dummy function that calls recursiveDel.

int main(int argc, char **argv) {
    struct node *trie = calloc(1, sizeof(struct node));
    for(int i = 1; i < argc; i++) {
        add(trie, argv[i]);
    }
    if (argc > 1)
        del(trie, argv[1]);
    if (argc > 2)
        del(trie, argv[2]);
    for (int i = 1; i < argc; i++) {
        printf("%d\n", isMember(trie, argv[i]));
    }
    return 0;
}

void add(struct node *trie, const char *str) {
    for (int i = 0; str[i]; i++) {
        // Check/goto next node
        // If NULL, create next node
        if (trie->children[(unsigned char)str[i]] == NULL)
            trie->children[(unsigned char)str[i]] = calloc(1, sizeof(struct node));
        trie = trie->children[(unsigned char)str[i]];
    }
    trie->end = true;
}

void del(struct node *trie, const char *str) {
    if (isMember(trie, str)) {
        recursiveDel(trie, str);
    }
}

bool isMember(struct node *trie, const char *str) {
    for (int i = 0; str[i]; i++) {
        if (trie->children[(unsigned char)str[i]] == NULL)
            return false;
        else
            trie = trie->children[(unsigned char)str[i]];
    }
    return trie->end;
}

// Features of this function:
// When it gets to the last node of the string, it resets the end flag,
// checks whether the node can be removed, and returns true if so.
// Each call passes a Boolean value back up the call stack.
// This Boolean value indicates whether or not the node can be deleted.
// If true, the caller frees the node and clears its pointer to it.
// If false, do nothing, because there are other strings using this node.
bool recursiveDel(struct node *trie, const char *str) {
    //printf("%p, %d, %s\n", (void *)trie, trie->end, str);
    if (*str) {
        if (!recursiveDel(trie->children[(unsigned char)str[0]], str + 1))
            return false;
        free(trie->children[(unsigned char)str[0]]);
        trie->children[(unsigned char)str[0]] = NULL;
    } else {
        trie->end = false;
    }
    if (trie->end)
        return false;

    for (int i = 0; i < 256; i++) {
        if (trie->children[i])
            return false;
    }
    return true;
}
