
Why does valgrind report that glibc tsearch() randomly leaks memory?

I'm using the glibc tsearch() API family to store dynamically allocated data blocks in an example program.

I am finding that when I use tsearch() to add several malloc()ed blocks to a tree, valgrind reports some of those blocks as "possibly lost". While "possibly lost" is not quite the same as "definitely lost", previous advice on SO is generally to investigate what is causing these reports.

My example program is as follows:

#include <stdio.h>
#include <search.h>
#include <stdlib.h>
#include <signal.h>

struct data {
    int id;
    char *str;
};

static int
compare (const void *a, const void *b) {
    const struct data *data_a = a, *data_b = b;

    if (data_a->id < data_b->id) {
        return -1;
    } else if (data_a->id > data_b->id) {
        return 1;
    } else {
        return 0;
    }
}

int main (int argc, char **argv) {
    void *tree = NULL;
    struct data *d1, *d2, *d3, *d4;

    d1 = malloc(sizeof(struct data));
    d1->id = 10;
    d1->str = "Hello";
    tsearch(d1, &tree, compare);

    d2 = malloc(sizeof(struct data));
    d2->id = 30;
    d2->str = "Goodbye";
    tsearch(d2, &tree, compare);

    d3 = malloc(sizeof(struct data));
    d3->id = 20;
    d3->str = "Thanks";
    tsearch(d3, &tree, compare);

    d4 = malloc(sizeof(struct data));
    d4->id = 40;
    d4->str = "OK";
    tsearch(d4, &tree, compare);

    raise(SIGINT);

    return 0;
}

Note that I'm calling raise(SIGINT) at the end of main() so that valgrind can still analyse the allocated blocks before they are implicitly free()d.

I am compiling and running under valgrind as follows:

$ gcc ts.c -o ts
$ valgrind --leak-check=full ./ts
==2091== Memcheck, a memory error detector
==2091== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==2091== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==2091== Command: ./ts
==2091== 
==2091== 
==2091== Process terminating with default action of signal 2 (SIGINT)
==2091==    at 0x4E7AE97: raise (raise.c:51)
==2091==    by 0x1088CE: main (in /home/ubuntu/ts)
==2091== 
==2091== HEAP SUMMARY:
==2091==     in use at exit: 160 bytes in 8 blocks
==2091==   total heap usage: 8 allocs, 0 frees, 160 bytes allocated
==2091== 
==2091== 24 bytes in 1 blocks are possibly lost in loss record 8 of 8
==2091==    at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==2091==    by 0x4F59483: tsearch (tsearch.c:338)
==2091==    by 0x108801: main (in /home/ubuntu/ts)
==2091== 
==2091== LEAK SUMMARY:
==2091==    definitely lost: 0 bytes in 0 blocks
==2091==    indirectly lost: 0 bytes in 0 blocks
==2091==      possibly lost: 24 bytes in 1 blocks
==2091==    still reachable: 136 bytes in 7 blocks
==2091==         suppressed: 0 bytes in 0 blocks
==2091== Reachable blocks (those to which a pointer was found) are not shown.
==2091== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==2091== 
==2091== For counts of detected and suppressed errors, rerun with: -v
==2091== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

$ 

Valgrind is reporting that one 24-byte block is possibly lost. If I move the raise(SIGINT) ahead of the d4 allocation and tree add, then no blocks are reported lost.

Why is one block lost when adding 4 blocks, even when none are lost when adding 3 blocks?

It turns out that the glibc tsearch() implementation is a bit naughty and can twiddle low-order bits in pointers to blocks that it has stored in a binary-search tree. It uses low-order bits in pointers to store flags: https://code.woboq.org/userspace/glibc/misc/tsearch.c.html#341

In particular, the implementation uses these macros to set or clear the low-order pointer bit to mark a tree node as red or black respectively:

#define SETRED(N) (N)->left_node |= ((uintptr_t) 0x1)
#define SETBLACK(N) (N)->left_node &= ~((uintptr_t) 0x1)

Setting the low-order bit effectively increments the pointer by one, so it no longer points at the start of the block but one byte into it. Valgrind reports a block that is reachable only through such an interior pointer as "possibly lost". But these are not lost blocks: they are still stored in the binary-search tree, modulo the low-order bit, and tsearch() masks the bit off whenever it accesses the tree. tsearch() can do this because malloc()ed blocks are aligned to at least even addresses, so the low-order bit of a genuine block address is always zero.
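The tagging trick described above can be sketched in isolation. This is a minimal illustration and not glibc's code: the helper names tag_low_bit(), is_tagged(), untag() and round_trip_ok() are made up for this example, and it assumes (as the answer does) that malloc() returns addresses whose low-order bit is zero.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helpers, not glibc's actual code: keep a one-bit flag
   in the lowest bit of a pointer that is stored as an integer. */
static uintptr_t tag_low_bit(void *p)
{
    return (uintptr_t) p | (uintptr_t) 0x1;
}

static int is_tagged(uintptr_t v)
{
    return (int) (v & (uintptr_t) 0x1);
}

static void *untag(uintptr_t v)
{
    return (void *) (v & ~(uintptr_t) 0x1);
}

/* Returns 1 if a malloc()ed pointer survives a tag/untag round trip. */
static int round_trip_ok(void)
{
    int *block = malloc(sizeof *block);
    if (block == NULL)
        return 0;
    *block = 42;

    /* Assumption: malloc() never returns an odd address. */
    assert(((uintptr_t) block & (uintptr_t) 0x1) == 0);

    uintptr_t stored = tag_low_bit(block);

    int ok = is_tagged(stored)
          && stored == (uintptr_t) block + 1      /* one byte into the block */
          && untag(stored) == (void *) block      /* masking recovers the start */
          && *(int *) untag(stored) == 42;

    free(block);
    return ok;
}
```

While only the tagged value is held anywhere, a leak checker scanning memory finds an address one byte past the start of the block, not the start itself, which is exactly the interior-pointer situation that gets reported as "possibly lost".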

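Only nodes that are marked "red" have this bit set. In the source linked above the bit is kept in the red node's own left_node field, so the block valgrind flags should be the left child of a red node. Note that the 24-byte blocks in the report are tree nodes allocated by tsearch() itself, not the struct data blocks. The pattern of which blocks are "possibly lost" therefore depends entirely on how the implementation colours nodes "red" or "black" during add, delete and rebalance operations.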

So tsearch()'s bit-twiddling makes valgrind incorrectly think these blocks may be lost. In this case, valgrind is reporting false positives.
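If the reports are a nuisance in a real program, one option is to tear the tree down before the leak check runs, so that no tagged pointers remain. The following is only a sketch under stated assumptions: build_and_teardown() is a hypothetical helper, the tree contents mirror the question's example, and the teardown uses the portable tdelete() loop (glibc also offers tdestroy() as a GNU extension, which needs _GNU_SOURCE).

```c
#include <search.h>
#include <stdlib.h>

struct data {
    int id;
    const char *str;
};

static int compare(const void *a, const void *b)
{
    const struct data *data_a = a, *data_b = b;
    return (data_a->id > data_b->id) - (data_a->id < data_b->id);
}

/* Hypothetical helper: builds the same four-entry tree as the question,
   then tears it down. Returns the number of data blocks freed. */
static int build_and_teardown(void)
{
    static const int ids[] = { 10, 30, 20, 40 };
    void *tree = NULL;
    int freed = 0;

    for (size_t i = 0; i < sizeof ids / sizeof ids[0]; i++) {
        struct data *d = malloc(sizeof *d);
        if (d == NULL)
            break;
        d->id = ids[i];
        d->str = "example";
        if (tsearch(d, &tree, compare) == NULL) {   /* node allocation failed */
            free(d);
            break;
        }
    }

    /* The first member of a tree node is the key pointer, so the root can
       be read as a pointer to our struct data pointer. */
    while (tree != NULL) {
        struct data *d = *(struct data **) tree;
        tdelete(d, &tree, compare);   /* frees the node tsearch() allocated */
        free(d);                      /* frees our own block */
        freed++;
    }
    return freed;
}
```

With every node and data block released, there should be nothing left in use for valgrind to classify, tagged pointers or not.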
