valgrind - different results with helgrind vs leak-check

I am seeing some strange behaviour that I do not understand. The code is a bit complex, so I will refrain from posting it here and instead describe the behaviour, in the hope that somebody who knows how valgrind works has an idea that I can pursue despite the limited information.

Background:

I am developing additional functionality for an open-source, C/C++ agent-based modelling platform (fork @ my github). Compilation is fine. So far everything seems to work as it should, based on my validation with test programs. Also, valgrind does not report any relevant errors. But reproducibility (which is crucial) behaves strangely.

Within the framework one defines a model file (basically the initialisation of a simulation run). Based on this file, one should be able to reproduce the exact same output, independent of platform. In a way this works: if I start the simulation environment (GUI version), load the file and run it, it produces the same result each time. Using the command-line version, I also get the same results each time.

But if I run the same model more than once from a running instance of the simulation environment, then the strange behaviour occurs, sometimes...

Compiler options used:

CC=g++
GLOBAL_CC=-march=native -std=gnu++14
SSWITCH_CC=-fnon-call-exceptions -Og -ggdb3 -Wall

The set-up:

I run the compiled program, which internally runs a fixed simulation set-up three times. It should produce the exact same results each time, which I check by printing random numbers at different stages.
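The check described above can be sketched roughly as follows. This is only an illustration, not the platform's code: run_fingerprint and runs_are_reproducible are hypothetical names, and the "simulation" is reduced to drawing numbers from a freshly seeded std::mt19937.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical stand-in for one simulation run: reseed the generator and
// record a few draws as a "fingerprint" of the run.
std::vector<std::uint32_t> run_fingerprint(std::uint32_t seed, std::size_t draws) {
    std::mt19937 rng(seed);  // reinitialised for every run
    std::vector<std::uint32_t> out;
    out.reserve(draws);
    for (std::size_t i = 0; i < draws; ++i) {
        out.push_back(static_cast<std::uint32_t>(rng()));
    }
    return out;
}

// Returns true if `repeats` runs inside one process all give the same fingerprint.
bool runs_are_reproducible(std::uint32_t seed, std::size_t draws, int repeats) {
    assert(repeats >= 1);
    const std::vector<std::uint32_t> reference = run_fingerprint(seed, draws);
    for (int r = 1; r < repeats; ++r) {
        if (run_fingerprint(seed, draws) != reference) {
            return false;
        }
    }
    return true;
}
```

In a real model the fingerprint would be taken at several stages of the run, because a divergence that starts late (as it did here) is invisible in the first few draws.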

The strange behaviour:

Option #1:

When I run the program in valgrind using the options:

valgrind --leak-check=full --leak-resolution=high --show-reachable=yes

the three internal runs do not produce the same results.

Report from Option #1:

Finished processing sim1
==6206==
==6206== HEAP SUMMARY:
==6206==     in use at exit: 43 bytes in 1 blocks
==6206==   total heap usage: 4,124,309 allocs, 4,124,308 frees, 888,390,511 bytes allocated
==6206==
==6206== 43 bytes in 1 blocks are still reachable in loss record 1 of 1
==6206==    at 0x4C2DDCF: realloc (vg_replace_malloc.c:785)
==6206==    by 0x5BE7FB2: getcwd (getcwd.c:84)
==6206==    by 0x143391: lsdmain(int, char**) (lsdmain.cpp:203)
==6206==    by 0x10C37D: main (main_gnuwin.cpp:29)
==6206==
==6206== LEAK SUMMARY:
==6206==    definitely lost: 0 bytes in 0 blocks
==6206==    indirectly lost: 0 bytes in 0 blocks
==6206==      possibly lost: 0 bytes in 0 blocks
==6206==    still reachable: 43 bytes in 1 blocks
==6206==         suppressed: 0 bytes in 0 blocks
==6206==
==6206== For counts of detected and suppressed errors, rerun with: -v
==6206== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Option #2:

However, when I use the following option instead:

valgrind --tool=helgrind

I do get the same results each time with the command-line version. Interestingly, the results of the first run with option #1 are the same as the results with option #2.

I would be grateful for any suggestions. Note that I am not a trained computer scientist... I am using mt19937 (reinitialised each time), and the initial random numbers are the same across the simulations, so I do not think the error lies there. Later within the run, however, the random numbers change with Option #1 (this is my test, alongside the time the simulation needs to find an equilibrium).

Finally, I found the issue: at two points in the program I sort a temporary vector of pairs, each holding a distance value and a pointer to an object located in a 2d space:

std::sort( vector.begin(), vector.end() ); // vector of std::pair<double, pointer>

The default comparison for std::pair falls back to the second item, here the pointer, whenever two distances are equal. The solution is to sort only by the first item of the pair:

std::sort( vector.begin(),vector.end(), [](auto const &A, auto const &B ){return A.first < B.first; } );
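The difference between the two comparisons can be made visible with a small sketch. The Agent type and both function names are hypothetical, and the second function uses std::stable_sort rather than the std::sort shown above, so that entries with equal distances keep their insertion order.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

struct Agent { int id; };  // hypothetical stand-in for an object on the 2d space

using Entry = std::pair<double, Agent*>;

// Default pair comparison: equal distances are ordered by pointer address,
// so the result depends on where the objects happen to live in memory.
void sort_with_pointer_tiebreak(std::vector<Entry>& v) {
    std::sort(v.begin(), v.end());
}

// Compare distances only; std::stable_sort keeps the insertion order of
// entries with equal distance, so addresses never influence the result.
void sort_by_distance(std::vector<Entry>& v) {
    std::stable_sort(v.begin(), v.end(),
                     [](Entry const& a, Entry const& b) { return a.first < b.first; });
}
```

With plain std::sort and the distance-only comparator, the relative order of equal distances is unspecified, but it no longer depends on addresses, so it stays the same as long as the input order is the same.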

Some remarks on why I did not find this issue directly:

  • When I implemented this sort, I intended to make it "stable". The object pointers are unique, so I expected the ordering to be the same in different subsets and independent of how I add the items to the set.
  • I did not consider that pointer values are (not precisely, but in effect) random numbers outside of my control.
  • I did not see this because somehow the OS (or whatever) always assigns the same pointer values across different invocations of the program (I suppose there is a "virtual" address space that is initialised afresh each time). Because of this, I did not suspect that pointers were the issue.
  • Curiously, when I ran the program with Valgrind and the --tool=helgrind option, the issue did not appear. One suggestion I got (offline) was that memcheck preinitialises memory with a given pattern; that would have been an explanation if uninitialised variables had been the cause. As it seems, helgrind also manages memory differently, giving each of my subsequent simulations "fresh" virtual memory, so that my pointer sorting was stable across the repeated loop.
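If the original intent from the first remark is still wanted (an ordering that does not depend on how the items were added), one possible variant is to break ties on an identifier the model controls instead of on the address. This is a sketch under assumptions: the Agent type, its id field and the function name are hypothetical, and the ids are assumed to be unique and assigned reproducibly.

```cpp
#include <algorithm>
#include <cassert>
#include <tuple>
#include <utility>
#include <vector>

struct Agent { int id; };  // hypothetical: `id` is unique and assigned by the model

using Entry = std::pair<double, Agent*>;

// Order by distance first, then by the model-controlled id. Pointer values
// are never compared, so the order is independent of memory layout and of
// the order in which entries were inserted.
void sort_by_distance_then_id(std::vector<Entry>& v) {
    std::sort(v.begin(), v.end(), [](Entry const& a, Entry const& b) {
        return std::tie(a.first, a.second->id) < std::tie(b.first, b.second->id);
    });
}
```

Because distance plus unique id gives a strict total order, plain std::sort is sufficient here and the result is the same for any insertion order.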

I hope this helps anybody who runs into the same problem. Thanks for all the suggestions!
