Memory leak when using regex.h?

A minimal code example follows:

#include <cstdlib>
#include <iostream>
#include <string>
#include <vector>
#include <regex.h>

using namespace std;

class regex_result {
public:
    /** Contains indices of starting positions of matches.*/
    std::vector<int> positions;
    /** Contains lengths of matches.*/
    std::vector<int> lengths;
};

regex_result match_regex(string regex_string, const char* string) {
    regex_result result;
    regex_t* regex = new regex_t;
    regcomp(regex, regex_string.c_str(), REG_EXTENDED);
    /* "pointer" points into the string, at the end of the
       previous match. */
    const char* pointer = string;
    /* "n_matches" is the maximum number of matches allowed. */
    const int n_matches = 10;
    regmatch_t matches[n_matches];
    int nomatch = 0;
    while (!nomatch) {
        nomatch = regexec(regex, pointer, n_matches, matches, 0);
        if (nomatch)
            break;
        for (int i = 0; i < n_matches; i++) {
            int start,
                finish;
            if (matches[i].rm_so == -1) {
                break;
            }
            start = matches[i].rm_so + (pointer - string);
            finish = matches[i].rm_eo + (pointer - string);
            result.positions.push_back(start);
            result.lengths.push_back(finish - start);
        }
        pointer += matches[0].rm_eo;
    }
    delete regex;
    return result;
}

int main(int argc, char** argv) {
    string str = "this is a test";
    string pat = "this";
    regex_result res = match_regex(pat, str.c_str());
    cout << res.positions.size() << endl;
    return 0;
}

I have written a function that searches a given string for regular expression matches. The result is held in a class that is essentially two vectors: one for the positions of the matches and one for the corresponding match lengths.

This works fine, but when I run valgrind over it, it reports some substantial memory leaks.

Running valgrind --leak-check=full on the code above gives:

==24843== Memcheck, a memory error detector
==24843== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==24843== Using Valgrind-3.10.0.SVN and LibVEX; rerun with -h for copyright info
==24843== Command: ./test
==24843== 
1
==24843== 
==24843== HEAP SUMMARY:
==24843==     in use at exit: 11,688 bytes in 37 blocks
==24843==   total heap usage: 54 allocs, 17 frees, 12,868 bytes allocated
==24843== 
==24843== 256 bytes in 1 blocks are definitely lost in loss record 14 of 18
==24843==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==24843==    by 0x543549A: regcomp (regcomp.c:487)
==24843==    by 0x400ED0: match_regex(std::string, char const*) (in <path>)
==24843==    by 0x4010CA: main (in <path>)
==24843== 
==24843== 11,432 (224 direct, 11,208 indirect) bytes in 1 blocks are definitely lost in loss record 18 of 18
==24843==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==24843==    by 0x4C2CF1F: realloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==24843==    by 0x5434BAF: re_compile_internal (regcomp.c:760)
==24843==    by 0x54354FF: regcomp (regcomp.c:506)
==24843==    by 0x400ED0: match_regex(std::string, char const*) (in <path>)
==24843==    by 0x4010CA: main (in <path>)
==24843== 
==24843== LEAK SUMMARY:
==24843==    definitely lost: 480 bytes in 2 blocks
==24843==    indirectly lost: 11,208 bytes in 35 blocks
==24843==      possibly lost: 0 bytes in 0 blocks
==24843==    still reachable: 0 bytes in 0 blocks
==24843==         suppressed: 0 bytes in 0 blocks
==24843== 
==24843== For counts of detected and suppressed errors, rerun with: -v
==24843== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)

Is my code wrong, or is there really a bug in those files?

Your regex_t does not need to be dynamically allocated. That isn't directly related to your problem, but it is a little odd. The real problem is that you never call regfree() on the compiled expression after regcomp() succeeds (and you should verify that it does). delete releases only the regex_t object itself, not the buffers that regcomp() allocated inside it, which is exactly what valgrind reports as lost. Set up your regular expression like this:

regex_t regex;
int res = regcomp(&regex, regex_string.c_str(), REG_EXTENDED);
if (res == 0)
{
    // use your expression via &regex
    ....

    // and eventually free it when done.
    regfree(&regex);
}
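
To make the fix concrete, here is a minimal sketch of the question's match_regex with the regex_t on the stack and a regfree() call once matching is done. The bool return value, the out parameter, the name cursor, and the empty-match guard are assumptions added for illustration; they are not part of the original code.

```cpp
#include <cassert>
#include <regex.h>
#include <string>
#include <vector>

// Same shape as the question's regex_result: parallel vectors of offsets and lengths.
struct regex_result {
    std::vector<int> positions;
    std::vector<int> lengths;
};

// Hypothetical rewrite of match_regex. Returns false if the pattern does not compile.
bool match_regex(const std::string& pattern, const char* text, regex_result& out) {
    assert(text != nullptr);
    regex_t regex;  // automatic storage: no new/delete needed
    if (regcomp(&regex, pattern.c_str(), REG_EXTENDED) != 0) {
        return false;  // compilation failed, so there is nothing to regfree()
    }
    const int n_matches = 10;  // maximum number of (sub)matches recorded per search
    regmatch_t matches[n_matches];
    const char* cursor = text;  // end of the previous match
    while (regexec(&regex, cursor, n_matches, matches, 0) == 0) {
        const int offset = static_cast<int>(cursor - text);
        for (int i = 0; i < n_matches && matches[i].rm_so != -1; ++i) {
            out.positions.push_back(offset + static_cast<int>(matches[i].rm_so));
            out.lengths.push_back(static_cast<int>(matches[i].rm_eo - matches[i].rm_so));
        }
        if (matches[0].rm_eo == 0) {
            // Empty match at the cursor: step forward so the loop terminates.
            if (*cursor == '\0') break;
            ++cursor;
        } else {
            cursor += matches[0].rm_eo;
        }
    }
    regfree(&regex);  // releases the buffers regcomp() allocated
    return true;
}
```

Calling it with the pattern "this" and the text "this is a test" from the question records one match at position 0 with length 4, and valgrind should no longer attribute lost blocks to regcomp().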

If your implementation supports it, I strongly advise using the <regex> library provided by C++11, as its RAII types handle most of this cleanup for you.
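
As a rough sketch of that alternative, the same search could be written with std::regex and std::sregex_iterator. This version is an assumption about how one might port the function, not the original author's code: it takes the text as a std::string and records whole matches only, not capture groups.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Same shape as the question's regex_result.
struct regex_result {
    std::vector<int> positions;
    std::vector<int> lengths;
};

// Hypothetical std::regex counterpart of match_regex (whole matches only).
regex_result match_regex(const std::string& pattern, const std::string& text) {
    regex_result result;
    // Throws std::regex_error if the pattern is invalid. The compiled state is
    // released by the destructor when re goes out of scope.
    const std::regex re(pattern, std::regex::extended);
    for (std::sregex_iterator it(text.begin(), text.end(), re), end; it != end; ++it) {
        result.positions.push_back(static_cast<int>(it->position()));
        result.lengths.push_back(static_cast<int>(it->length()));
    }
    assert(result.positions.size() == result.lengths.size());
    return result;
}
```

There is no counterpart to regfree() here: std::regex owns its compiled state and cleans it up on every exit path, including when an exception is thrown.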

You must call regfree() to release the memory allocated by regcomp().
