Enormous amount of memory usage, no memory leak detected

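I'm having trouble finding a memory leak in my program.

top reports that memory usage increases as the program runs. When I profile the program with valgrind, no memory leaks are reported.

The program consists of one "reader" thread and several "consumer" threads.

The "reader" thread loads data into one of several char** pointers, one for each "consumer" thread.

Each "consumer" thread works on the data of its corresponding char* pointers and frees the memory.

I have included some pseudocode that describes what my program is doing. I know the code provided might not be enough to describe the problem. I am happy to include the entire code project if that helps.

"reader" thread, condensed for brevity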

//'nconsumers': number of consumer threads
char ***queue = malloc(nconsumers*sizeof(char **));
for (int i = 0; i < nconsumers; i++) {
    //'length' number of datapoints a 'consumer' works on at a time
    queue[i] = malloc(length*sizeof(char *));
}

char *data = NULL;
int qtracker = 0; //tracks to which 'consumer' data should be assigned
int ltracker = 0; //tracks how many datapoints have been added to each 'consumer'
//loaddata reads data and stores it in 'data' struct
while (loaddata(data) >= 0) {
    char *datapoint = malloc(data->length);
    memcpy(datapoint, data->content, data->length);
    queue[qtracker][ltracker] = datapoint;
    qtracker++;
    if (nconsumers == qtracker) { 
        qtracker = 0;
        ltracker++;
        if (length == ltracker) ltracker = 0;
    }
}
//NULL pointers are added to the end of each 'consumer' queues to indicate all data has been read

"consumer" thread

//Consumers are initialized and a queue is assigned to them
int qnum = "some number between 0 and nconsumers";
int datatracker = 0;
char **dataqueue = queue[qnum];

datapoint = dataqueue[datatracker]
datatracker++;
while (datapoint != NULL) {
    //Do work on data
    free(datapoint);
    datapoint = dataqueue[datatracker];
    datatracker++;

    //More synchronization code
}

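The "consumer" threads read the data correctly and process it as they should. Again, valgrind reports no memory leaks. When I monitor the process with top or htop, its memory usage keeps increasing until my machine starts swapping.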
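To watch the same figure from inside the program rather than from top or htop, the resident set size can be logged at chosen points (for example before and after each batch). The following is a minimal sketch, assuming Linux and its /proc/self/statm file; the helper name resident_kb is hypothetical and is not part of the original program.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (not in the original program).
   Returns this process's resident set size in kilobytes, or -1 on failure.
   Linux-specific: the second field of /proc/self/statm is the number of
   resident pages. */
long resident_kb(void)
{
    FILE *fp = fopen("/proc/self/statm", "r");
    if (!fp) return -1;

    long total_pages = 0, resident_pages = 0;
    int fields = fscanf(fp, "%ld %ld", &total_pages, &resident_pages);
    fclose(fp);
    if (fields != 2) return -1;

    long page_size = sysconf(_SC_PAGESIZE);   /* bytes per page */
    if (page_size <= 0) return -1;
    return resident_pages * (page_size / 1024);
}
```

Printing resident_kb() to stderr right after the reader fills the queues and again after the consumers finish shows whether the freed memory actually leaves the process.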

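Edit

I have added a complete program that reproduces the error. It is not exactly the program I am having trouble with, since that one has additional dependencies. Again, this program spawns 1 "reader" thread and N consumer threads. When run on a large text file with hundreds of millions of lines (such as a DNA sequencing file), htop steadily shows growing memory usage, while valgrind shows no memory leaks except for Z4069B248C57DD4625F9E0191.

Thanks again for all the help!!

Compile and run on any modern Linux box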

gcc -Wall -o <name> <program.c> -lm -lpthread
./name large_text_file.txt <num_threads> <>

Only this warning should show up, since I do not actually use the extracted pointer in this example:

<program>.c: In function ‘consumer’:
<program>.c:244:11: warning: variable ‘line’ set but not used [-Wunused-but-set-variable]
     char *line = NULL;

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <string.h>
#include <math.h>
#include <unistd.h>

// Data passed to threads
typedef struct {
    //Input file
    FILE *fp;
    //Number of threads
    int numt;
    //Synchronization data
    pthread_mutex_t mtx;
    pthread_cond_t workcond;
    pthread_cond_t readcond;
    int gowork;
    int goread;
    //Tracks how many threads are done analyzing data
    int doneq;
    /*
      Stores "data queues" (1 queue per thread)
      queue ->       [  [ char**    [ char**    [ char**    [ char**    [ char**
len(queue)=numt          [char*]     [char*]     [char*]     [char*]     [char*]
len(queue[n])=maxqueue   [char*]     [char*]     [char*]     [char*]     [char*]
len(queue[n][m])=data      ...         ...         ...         ...         ...
                         [char*]     [char*]     [char*]     [char*]     [char*]
                                 ]           ]           ]           ]          ]
                                                                                ]
    */
    char ***queue;
    //Internal thread ID
    int *threadidx;
    //Maximum number of lines to read
    int maxseqs;
    //Maximum number of lines per thread == maxseqs/numthreads
    int maxqueue;
} thread_t;

/*
Reads lines from text file, copies line content and length into a char * pointer
and adds it to an "analysis queue" to be processed by one of the "consumers"
*/
void *reader(void *threaddata);

/*
Extracts char * pointers from one of the "data queues". Does work with
the data and frees when done.
*/
void *consumer(void *threaddata);

/*
Initializes thread data
*/
int  threadtinit(FILE *fp, int numt, thread_t *threaddata, int maxseqs);

/*
Cleans thread data before exit
*/
void threadtkill(thread_t *threaddata);


int main(int argc, char *argv[])
{
    if (argc < 4) {
        fprintf(stderr, "ERROR: Not enough arguments.\n");
        exit(-1);
    }

    FILE *fp = fopen(argv[1], "r");
    if (!fp) {
        fprintf(stderr, "ERROR: Failed to open input file.\n");
        exit(-1);
    }

    int numt = atoi(argv[2]);
    if (!numt) {
        fprintf(stderr, "ERROR: Please specify number of threads.\n");
        fclose(fp);
        exit(-1);
    }

    int maxseqs = atoi(argv[3]);
    if (!maxseqs) {
        fprintf(stderr, "ERROR: Please specify max number of lines.\n");
        fclose(fp);
        exit(-1);
    }

    //Start data struct for threads
    thread_t threaddata;
    if (!threadtinit(fp, numt, &threaddata, maxseqs)) {
        fprintf(stderr, "ERROR: Could not initialize thread data.\n");
        fclose(fp);
        exit(-1);
    }
    fprintf(stderr, "Thread data initialized.\n");


    //return code
    int ret;

    //pthread creation
    pthread_t readerthread;
    pthread_t *consumerpool = NULL;
    consumerpool = malloc((numt)*sizeof(pthread_t));
    if (!consumerpool) {
        fprintf(stderr, "Failed to allocate threads.\n");
        ret = -1;
        goto exit;
    }

    // Initialize and set thread detached attribute
    pthread_attr_t attr;
    pthread_attr_init(&attr);
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);

    //Consumer threads
    int thrc;
    for (int i = 0; i < numt; i++) {
        thrc = pthread_create(consumerpool + i,
                              &attr,
                              consumer,
                              (void *)&threaddata);
        if (thrc) {
            fprintf(stderr, "ERROR: Thread creation.\n");
            ret = -1;
            goto exit;
        }
    }

    //Couple of sleeps to keep track of stuff while running
    sleep(1);

    //Reader thread
    thrc = pthread_create(&readerthread,
                          &attr,
                          reader,
                          (void *)&threaddata);
    if (thrc) {
        fprintf(stderr, "ERROR: Thread creation.\n");
        ret = -1;
        goto exit;
    }


    // Free attribute and wait for the other threads
    pthread_attr_destroy(&attr);

    int jrc;
    jrc = pthread_join(readerthread, NULL);
    if (jrc) {
        fprintf(stderr, "Thread error join. Return code: %d\n", jrc);
    }
    for (int i = 0; i < numt; i++) {
        jrc = pthread_join(*(consumerpool + i), NULL);
        if (jrc) {
            fprintf(stderr, "Thread error join. Return code: %d\n", jrc);
            ret = -1;
            goto exit;
        }
    }
    ret = 0;
    exit:
        threadtkill(&threaddata);
        free(consumerpool);
        fprintf(stderr, "Finished.\n");
        return(ret);
}


void *reader(void *readt)
{
    fprintf(stderr, "Reader thread started.\n");
    thread_t *threaddata = readt;
    int numt = threaddata->numt;
    int maxqueue = threaddata->maxqueue;
    int maxseqs = threaddata->maxseqs;
    FILE *fp = threaddata->fp;

    // Array of queues, one per consumer thread
    char ***queue = threaddata->queue;

    // Number of bytes used to store length of line
    size_t bytes = sizeof(ssize_t);
    // Tracks number of lines loaded so far
    size_t nlines = 0;

    // Tracks to which queue data should be added to
    int qtracker = 0;
    // Tracks to which position in any particular queue, data should be added
    int ltracker = 0;

    // Holds read line
    char *line = NULL;
    ssize_t linelength = 0;
    size_t n;

    // Tracks how much data will be read
    size_t totallength = 0;
    size_t totallines = 0;
    while ( (linelength =  getline(&line, &n, fp)) != -1 ) {
        // enough data is used to hold line contents + line length
        char *data = malloc(bytes + linelength + 1);

        if (!data) {
            fprintf(stderr, "memerr\n");
            continue;
        }
        // move line length bytes to data
        memcpy(data, &linelength, bytes);
        //move line bytes to data
        memcpy(data + bytes, line, linelength + 1);

        totallength += linelength;

        // Add newly allocated data to one of numt queues
        queue[qtracker][ltracker] = data;
        qtracker++;
        if (numt == qtracker) {
            // Loop around queue
            qtracker = 0;
            ltracker++;
            // Loop around positions in queue
            if (maxqueue == ltracker) ltracker = 0;
        }
        nlines++;
        // Stop reading thread and start consumer threads
        if (nlines == maxseqs) {
            fprintf(stderr, "%zu lines loaded\n", nlines);
            sleep(3);
            totallines += nlines;
            nlines = 0;
            fprintf(stderr, "Waking up consumers\n");
            pthread_mutex_lock(&(threaddata->mtx));
            //Wake consumer threads
            threaddata->gowork = 1;
            pthread_cond_broadcast(&(threaddata->workcond));
            //Wait for consumer threads to finish
            while ( !threaddata->goread ) {
                pthread_cond_wait(&(threaddata->readcond),
                                  &(threaddata->mtx));
            }
            fprintf(stderr, "Reader has awoken!!!!\n\n");
            sleep(3);
            threaddata->goread = 0;
            pthread_mutex_unlock(&(threaddata->mtx));
        }
    }

    //Add NULL pointers to the end of each queue to indicate reading is done
    pthread_mutex_lock(&(threaddata->mtx));
    for (int i = 0; i < numt; i++) {
        queue[i][ltracker] = NULL;
    }
    // Wake consumers for the last time
    threaddata->gowork = 1;
    pthread_cond_broadcast(&(threaddata->workcond));
    pthread_mutex_unlock(&(threaddata->mtx));

    // Log info
    fprintf(stderr, "%zu characters read.\n", totallength);
    if (line) free(line);
    pthread_exit(NULL);
}


void *consumer(void *consumert)
{
    thread_t *threaddata = consumert;
    // Number of consumer threads
    int numt = threaddata->numt;
    // Max length of queue to extract data from
    int maxqueue = threaddata->maxqueue;

    // Holds data sent by reader thread
    char *data = NULL;
    // Holds the actual line read
    char *line = NULL;
    size_t linelength;
    size_t bytes = sizeof(ssize_t);

    // get queue number for corresponding thread
    int qnum = -1;
    pthread_mutex_lock(&(threaddata->mtx));
    int *tlist = threaddata->threadidx;
    while (qnum == -1) {
        qnum = *tlist;
        *tlist = -1;
        tlist++;
    }
    fprintf(stderr, "Thread got queueID: %d.\n", qnum);
    pthread_mutex_unlock(&(threaddata->mtx));
    // Each thread works on one and only one queue
    char **queue = threaddata->queue[qnum];

    //After initializing, wait for reader to start working
    pthread_mutex_lock(&(threaddata->mtx));
    while ( !threaddata->gowork) {
        pthread_cond_wait(&(threaddata->workcond), &(threaddata->mtx));
    }
    fprintf(stderr, "Consumer thread started queueID %d.\n", qnum);
    pthread_mutex_unlock(&(threaddata->mtx));

    // Tracks number of characters this thread consumes
    size_t totallength = 0;
    // Tracks from which position in queue data should be taken from
    size_t queuecounter = 1;
    // Get first data point
    data = queue[0];

    while (data != NULL) {
        //get line length
        memcpy(&linelength, data, bytes);

        //get line
        line = data + bytes;

        //Do work
        totallength += linelength;
        free(data);

        //Check for number of sequences analyzed
        if (queuecounter == maxqueue) {
            // Wait for other threads to catchup
            sleep(1);
            queuecounter = 0;
            pthread_mutex_lock(&(threaddata->mtx));
            threaddata->doneq++;
            threaddata->gowork = 0;
            // If this thread is the last one to be done with its queue, wake
            // reader
            if (threaddata->doneq == numt) {
                threaddata->goread = 1;
                pthread_cond_signal(&(threaddata->readcond));
                threaddata->doneq = 0;
            }
            // When done consuming data, wait for reader to load more
            while (!threaddata->gowork) {
                pthread_cond_wait(&(threaddata->workcond),
                                  &(threaddata->mtx));
            }
            pthread_mutex_unlock(&(threaddata->mtx));
        }
        //Get next line
        data = queue[queuecounter];
        queuecounter++;
    }

    // Log and exit
    fprintf(stderr, "\tThread %d analyzed %zu characters.\n", qnum, totallength);
    pthread_exit(NULL);
}


int  threadtinit(FILE *fp, int numt, thread_t *threaddata, int maxseqs)
{
    threaddata->fp = fp;
    //Determine maximum thread queue length
    threaddata->maxqueue = ceil((float)maxseqs/numt);
    threaddata->maxseqs = threaddata->maxqueue*numt;
    fprintf(stderr, "max lines to load: %d\n", threaddata->maxseqs);
    fprintf(stderr, "max lines per thread: %d\n", threaddata->maxqueue);
    threaddata->numt = numt;
    //Allocate data for queues and initialize them
    threaddata->queue = malloc(numt*sizeof(char **));
    threaddata->threadidx = malloc(numt*sizeof(int));
    for (int i = 0; i < numt; i++) {
        threaddata->queue[i] = malloc(threaddata->maxqueue*sizeof(char *));
        threaddata->threadidx[i] = i;
    }
    //Initialize synchronization data
    pthread_mutex_init(&(threaddata->mtx), NULL);
    pthread_cond_init(&(threaddata->workcond), NULL);
    pthread_cond_init(&(threaddata->readcond), NULL);
    threaddata->gowork = 0;
    threaddata->goread = 0;
    threaddata->doneq = 0;
    return 1;
}


void threadtkill(thread_t *threaddata)
{
    fclose(threaddata->fp);
    for (int i = 0; i < threaddata->numt; i++) {
        free(threaddata->queue[i]);
    }
    free(threaddata->queue);
    free(threaddata->threadidx);
    pthread_mutex_destroy(&(threaddata->mtx));
}

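This line looks suspicious:

if (length == ltracker) ltracker++;

I would normally expect to see:

if (length == ltracker) ltracker = 0; /* wrap */

but without the whole context it is a bit vague. Also, it seems pretty clear that you are creating a race between the producer and the consumers with all this, which may prove harder to debug than your current problem.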
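The wrap-around that the expected line performs can be sketched in isolation. This is only an illustration; the helper name next_slot is hypothetical and does not appear in the posted code.

```c
#include <stddef.h>

/* Hypothetical helper: advance a slot index and wrap back to 0 once it
   reaches `length`, so the index always stays inside [0, length). */
size_t next_slot(size_t pos, size_t length)
{
    pos++;
    if (pos == length) pos = 0; /* wrap */
    return pos;
}
```

With the increment variant instead of the reset, the index would step past the end of the array on the next write.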

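Since you are going three levels deep, you do recognize that your buffer space is O(n^3), and that free() rarely shrinks your process memory. free() generally only lets you recycle heap that was previously allocated, so your program grows until it no longer needs to ask the system for more memory and then stays at that size.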

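Note that the following focuses only on the code snippets you call the reader and consumer threads, although, as pointed out in the comments and in other answers, there are other potential sources of the problem that should be reviewed...

In your reader thread:

while (loaddata(data) >= 0) {
    char *datapoint = malloc(data->length);
    ...
    // Note: there are no free(datapoint); calls in this loop
}

Clearly, datapoint is created within this block, but it is not freed within this block.

The following are likely contributing factors to the memory leak:

  • Because the instance of datapoint in the reader thread is created within the block, its life exists only within that block. The memory created at that address continues to be owned by the process that created it, but outside that block the pointer variable pointing to that memory no longer lives, and therefore it cannot be freed outside that block. And because I see no call to free(datapoint) within that block, it is never freed.

  • Compounding this, char *datapoint = malloc(data->length); is called in a loop (with no call to free in between), each time creating new memory at a new address while overwriting the address that referenced its predecessor, which makes it impossible to free all the previous allocations.

  • The instance of datapoint in the consumer thread, although it has the same symbol as the one in the reader thread, does not point to the same memory space. So even though that variable is freed, it is not freeing the instance of datapoint that exists in the reader thread.

Code excerpt from the consumer thread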

datapoint = dataqueue[datatracker]  //Note no ";" making this code uncompilable
                                    //forcing the conclusion that code posted
                                    //is not code actually used, 
                                    //Also: where is this instance of datapoint
                                    //created ?
datatracker++;
while (datapoint != NULL) {
    //Do work on data
    free(datapoint);
    datapoint = dataqueue[datatracker];
    datatracker++;

    //More synchronization code
}

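Based on the questions in the comments, and general Linux threading information:
Why doesn't Valgrind detect memory leaks (SO question)
Passing data between threads (question)
Creating threads in Linux (tutorial)
Linux Tutorial: POSIX Threads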

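It turns out there is nothing wrong with my code per se. Calling free() on memory obtained from malloc() releases it on the heap for reuse by the program, but that does not mean it goes back to the system. The reason for this is still a bit beyond my understanding.

Valgrind was not reporting memory leaks because there are none.

After doing some research, reading more about the nature of dynamic memory allocation, I landed on:

Force free() to return malloc memory back to OS

Why does the free() function not return memory to the operating system?

Will malloc implementations return free-ed memory back to the system?

Memory not freed after calling free()

Calling malloc_trim() after each free was enough to make the system reclaim the allocated memory.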
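As a minimal sketch of that fix, assuming glibc (malloc_trim() is a glibc extension declared in <malloc.h>, not standard C): the helper name free_batch_and_trim is hypothetical, and it trims once per batch rather than after every single free() as described above, which is a variation chosen here to avoid calling into the allocator on every block.

```c
#include <malloc.h>   /* malloc_trim(): glibc extension, not standard C */
#include <stdlib.h>

/* Hypothetical helper (not in the original program).
   Frees `count` blocks, then asks glibc once to hand free heap memory
   back to the kernel. Returns malloc_trim()'s result: 1 if some memory
   was released to the system, 0 if none could be. */
int free_batch_and_trim(char **blocks, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        free(blocks[i]);
        blocks[i] = NULL;   /* avoid leaving dangling pointers behind */
    }
    return malloc_trim(0);  /* 0 = keep no extra padding at the heap top */
}
```

In the posted program a call like this would fit where a consumer finishes its queue, just before it signals the reader; whether trimming per batch or per free() works better depends on the workload.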

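For example, without calling malloc_trim(), the CPU and memory usage of my program looks like this: [plot: CPU and memory usage over time, without malloc_trim()] On each call to my "reader" thread (the first peak in CPU usage), some memory is allocated. Calling my "consumer" threads frees the requested memory, but the memory is not always returned to the system, as the blue line in the plot shows.

With malloc_trim() after each free(), memory usage looks the way I expected: [plot: CPU and memory usage over time, with malloc_trim()] While the "reader" thread is executing, the memory associated with the process increases. When the "consumers" run, memory is freed and returned to the OS.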
