Valgrind: REALLOC Uninitialised value was created by a heap allocation

I have read and tried to apply the solutions found on Stack Overflow, but the problem is still not solved.

Conditional jump or move depends on uninitialised value(s): Uninitialised value was created by a heap allocation.

Error reported by Valgrind:

[Image: error]

I'm trying to read a file line by line and dynamically realloc an array to hold the lines.

The error is on this line: result = realloc(result, currLen * sizeof(char *));


void readFile(char *fileName) {
    FILE *fp = NULL;
    size_t len = 0;

    int currLen = 2;
    char **result = calloc(currLen, sizeof(char *));

    fp = fopen(fileName, "r");
    if (fp == NULL)
        exit(EXIT_FAILURE);

    if (result == NULL)
        exit(EXIT_FAILURE);

    int i = 0;
    while (getline(&(*(result + i)), &len, fp) != -1) {
        if (i >= currLen - 1) {
            currLen *= 2;
            result = realloc(result, currLen * sizeof(char *));
        }
        ++i;
    }

    fclose(fp);

    for (int j = 0; j < currLen; ++j) {
        free(*(result + j));
    }

    free(result);
    result = NULL;
}

int main() {
    readFile("");

    exit(EXIT_SUCCESS);
}

In the originally posted code, result undergoes an initial allocation via calloc, which zero-initializes its contents and, since they are pointers, null-initializes them. Later on, when the sequence is expanded via realloc, no such affordance is given: the newly added slots hold indeterminate values. In effect, if the original array looked like this:

[ NULL, NULL ]

and, after adding two elements, looks like this:

[ addr1, addr2 ]

then realloc kicks in and gives you this:

[ addr1, addr2, ????, ???? ]

To rub salt in the wound, getline also requires the length argument to reflect the size of the allocation that the line pointer refers to. But you're carrying the length over from the prior loop iteration, so not only is the pointer wrong after the first expansion, the length is never correct after the first invocation of getline (leading to your actual crash; the rest of the problems are just something you haven't seen yet).

Solving all of this

  1. Use a separate pointer and length for each iteration.
  2. Ensure they're properly initialized to NULL and 0 before the getline call.
  3. If you read the line successfully, expand the line pointer buffer if needed.
  4. Store the pointer, discard the length, and reset both to NULL and 0 before the next iteration.

In practice, it looks like this:

#define _POSIX_C_SOURCE  200809L
#include <stdio.h>
#include <stdlib.h>

char **readFile(const char *fileName, size_t *lines)
{
    FILE *fp = fopen(fileName, "r");
    if (fp == NULL)
        exit(EXIT_FAILURE);

    // initially empty, no size or capacity
    char **result = NULL;
    size_t size = 0;
    size_t capacity = 0;

    size_t len = 0;
    char *line = NULL;
    while (getline(&line, &len, fp) != -1)
    {
        if (size == capacity)
        {
            size_t new_capacity = (capacity ? 2 * capacity : 1);
            void *tmp = realloc(result, new_capacity * sizeof *result);
            if (tmp == NULL)
            {
                perror("Failed to expand lines buffer");
                exit(EXIT_FAILURE);
            }

            // recoup the expanded buffer and capacity
            result = tmp;
            capacity = new_capacity;
        }

        result[size++] = line;

        // reset these to NULL,0. they trigger the fresh allocation
        //  and size storage on the next iteration.
        line = NULL;
        len = 0;
    }

    // if getline allocated a buffer on the failure case
    //  get rid of it (didn't see that coming).
    if (line)
        free(line);

    fclose(fp);

    *lines = size;
    return result;
}

int main()
{
    size_t count = 0;
    char **lines = readFile("/usr/share/dict/words", &count);
    if (lines)
    {
        for (size_t i = 0; i < count; ++i)
        {
            fputs(lines[i], stdout);
            free(lines[i]);
        }

        free(lines);
    }

    return 0;
}

On a stock Linux/Mac system, /usr/share/dict/words contains about a quarter-million English words. On my stock Mac, it's 235886 (yours will vary). The caller gets the line pointers and the count, and is responsible for freeing the content therein.

Output

A
a
aa
aal
aalii
aam
Aani
aardvark
aardwolf
Aaron
Aaronic
Aaronical
Aaronite
Aaronitic
Aaru
.... a ton of lines omitted ....
zymotically
zymotize
zymotoxic
zymurgy
Zyrenian
Zyrian
Zyryan
zythem
Zythia
zythum
Zyzomys
Zyzzogeton

Valgrind Summary

==17709== 
==17709== HEAP SUMMARY:
==17709==     in use at exit: 0 bytes in 0 blocks
==17709==   total heap usage: 235,909 allocs, 235,909 frees, 32,506,328 bytes allocated
==17709== 
==17709== All heap blocks were freed -- no leaks are possible
==17709== 

Alternative: Let getline reuse its buffer

There is no guarantee that the buffer getline allocates matches the line length efficiently. In fact, the only guarantee is that, on successful execution, the function returns the number of chars read, including the delimiter (but not the terminator), and the memory holds that data. The actual allocation size could be considerably more than that, and that space is effectively wasted.

To demonstrate this, consider the following. It is the same code, but we do NOT reset the buffer on each loop, and rather than store its pointer directly, we store a strdup of the line. Note this only works if the line does not contain embedded null chars. This allows getline to reuse its buffer, and only expand it if needed, for each read. We're responsible for making the actual copy of the line data (and we do, using POSIX strdup). When executed there are still no leaks, but note the Valgrind summary, specifically the number of bytes allocated compared with the number from the previous version above.

#include <string.h>  // strdup (in addition to the headers from the first version)

char **readFile(const char *fileName, size_t *lines)
{
    FILE *fp = fopen(fileName, "r");
    if (fp == NULL)
        exit(EXIT_FAILURE);

    // initially empty, no size or capacity
    char **result = NULL;
    size_t size = 0;
    size_t capacity = 0;

    size_t len = 0;
    char *line = NULL;
    while (getline(&line, &len, fp) != -1)
    {
        if (size == capacity)
        {
            size_t new_capacity = (capacity ? 2 * capacity : 1);
            void *tmp = realloc(result, new_capacity * sizeof *result);
            if (tmp == NULL)
            {
                perror("Failed to expand lines buffer");
                exit(EXIT_FAILURE);
            }

            // recoup the expanded buffer and capacity
            result = tmp;
            capacity = new_capacity;
        }

        // make copy here. let getline reuse 'line'
        char *copy = strdup(line);
        if (copy == NULL)
        {
            perror("Failed to copy line");
            exit(EXIT_FAILURE);
        }
        result[size++] = copy;
    }

    // free whatever was left
    if (line)
        free(line);

    fclose(fp);

    *lines = size;
    return result;
}

Valgrind Summary

==17898== 
==17898== HEAP SUMMARY:
==17898==     in use at exit: 0 bytes in 0 blocks
==17898==   total heap usage: 235,909 allocs, 235,909 frees, 6,929,003 bytes allocated
==17898== 
==17898== All heap blocks were freed -- no leaks are possible
==17898== 

The number of allocations is the same (which tells us getline allocated a large enough buffer up front to never need expansion), but the total allocated space is considerably smaller, as we are now storing strings in buffers allocated to match their length, not in whatever getline stood up as a read buffer.
