
What causes a segfault in C++ STL list node memory allocation?

I've written some C++ code that runs perfectly fine on my laptop PC (compiled under both a Microsoft compiler and g++ under MinGW). I am in the process of porting it to a Unix machine.

I've compiled with both g++ and Intel's icpc on the Unix machine, and in both cases my program crashes (segfaults) after running for a while. I can run it for a short time without a crash.

When I debug, I find that the crash happens when the program tries to copy an STL list. Specifically, it happens when the program tries to allocate memory to create a new node in the list. The error I get in the debugger (TotalView) is that "an allocation call failed or the address returned is null."

The crash does not happen in the same place in the code each time I run it, but it does always happen during an allocation call to create a node in an STL list. I don't think I'm running out of memory. I have a few memory leaks, but they're very small. What else can cause a memory allocation error? And why does it happen on the Unix machine and not on my PC?

UPDATE: I used MemoryScape to help debug. When I used guard blocks, the program ran through without crashing, further suggesting a memory issue. What finally worked to nail down the problem was to "paint" allocated memory. It turns out I was declaring a variable but not setting it to a value before I used it as an array index. The write into the array therefore used whatever garbage was in the variable's memory location. Often that was 0 or some other small number, so there was no problem. But when I ran the program long enough, the variable was more likely to hold a larger number, and writing out of bounds of the array corrupted the heap. Painting the allocated memory with a large number forced a segfault right at the line of code where I attempted to write a value into the array, and I could see that large painted number being used as the array index.
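The effect described in the update can be sketched in plain C++ without relying on undefined behavior. This is only an illustration, not how MemoryScape works internally: the paint byte `kPaintByte` and the helpers `read_index_from_painted_memory` and `checked_store` are hypothetical names invented here. The idea is that a never-assigned index read from painted memory becomes an obviously huge number, and a bounds-checked store then fails at the faulty line instead of silently corrupting the heap.

```cpp
#include <array>
#include <cstddef>
#include <cstring>
#include <stdexcept>

// Hypothetical paint pattern; real tools let you choose their own value.
constexpr unsigned char kPaintByte = 0xA5;

// Simulates reading a variable that was never assigned, from memory that a
// debugging allocator has filled ("painted") with a known byte pattern.
std::size_t read_index_from_painted_memory() {
    unsigned char raw[sizeof(std::size_t)];
    std::memset(raw, kPaintByte, sizeof raw);   // paint the bytes
    std::size_t index;
    std::memcpy(&index, raw, sizeof index);     // reinterpret them as an index
    return index;
}

// Bounds-checked store: reports failure instead of writing out of range.
bool checked_store(std::array<int, 8>& table, std::size_t index, int value) {
    try {
        table.at(index) = value;                // at() throws on a bad index
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```

With these helpers, `checked_store(table, 0, 42)` succeeds, while passing the painted index is rejected immediately, which mirrors how painting turned a rare, delayed crash into a reproducible one at the offending write.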

This is likely caused by heap corruption: elsewhere in the code, you're overwriting freed memory, or writing to memory outside the bounds of your allocations (buffer overflows, or writing before the start of allocated memory). Heap corruption typically results in crashes at an unrelated location, such as in STL code. Since you're on a Unix platform, you should try running your program under valgrind to identify the original heap corruption.

This sounds like corruption of the dynamic memory allocator's data structures, which is often caused by other, unrelated code. This type of bug is notoriously hard to find and reproduce without external tools, because any change in memory layout can mask it. It probably worked through luck in the Windows version.

Memory debuggers are great tools for catching such corruption. valgrind, dmalloc, and efence are all good options for checking the correctness of your program.
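To give a rough feel for one technique such tools can use, here is a minimal sketch of guard blocks (sometimes called fence posts): a known byte pattern is placed before and after a buffer and checked later. This is not how valgrind, dmalloc, or efence are implemented; the class `GuardedBuffer` and the constants `kGuardByte` and `kGuardSize` are hypothetical, and the overrun is simulated inside a single `std::vector` so the example stays well-defined.

```cpp
#include <cstddef>
#include <vector>

constexpr unsigned char kGuardByte = 0xFD;  // hypothetical guard pattern
constexpr std::size_t kGuardSize = 16;      // hypothetical guard width in bytes

class GuardedBuffer {
public:
    explicit GuardedBuffer(std::size_t payload_size)
        : storage_(payload_size + 2 * kGuardSize, 0),
          payload_size_(payload_size) {
        // Fill the zones before and after the payload with the guard pattern.
        for (std::size_t i = 0; i < kGuardSize; ++i) {
            storage_[i] = kGuardByte;
            storage_[kGuardSize + payload_size_ + i] = kGuardByte;
        }
    }

    // Deliberately not checked against payload_size_, to mimic a raw array:
    // an offset just past the payload lands in the trailing guard zone.
    // at() still keeps the write inside storage_, so nothing is truly corrupted.
    void write(std::size_t offset, unsigned char value) {
        storage_.at(kGuardSize + offset) = value;
    }

    // Returns true if neither guard zone has been overwritten.
    bool guards_intact() const {
        for (std::size_t i = 0; i < kGuardSize; ++i) {
            if (storage_[i] != kGuardByte) return false;
            if (storage_[kGuardSize + payload_size_ + i] != kGuardByte) return false;
        }
        return true;
    }

private:
    std::vector<unsigned char> storage_;
    std::size_t payload_size_;
};
```

A write at offset 7 of an 8-byte buffer leaves the guards intact, while a write at offset 8 (one past the end) is detected by `guards_intact()`. Guard zones also change the memory layout and absorb small overruns, which may be why the program in the question stopped crashing once guard blocks were enabled.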

 I have a few memory leaks, but they're very small.

Well, if you run it for a while, then it ends up being a lot of memory. That's kind of the thing about leaks. You should log your memory usage at the point of the crash to see if there was any memory available.
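One way to log memory usage on a Unix system is the POSIX `getrusage` call, sketched below. The helper names `peak_rss` and `log_memory_usage` are made up for this example, and note that it reports the peak resident set size of the process, not how much memory the system still has free. The units of `ru_maxrss` also differ by platform (kilobytes on Linux, bytes on macOS).

```cpp
#include <sys/resource.h>  // getrusage, struct rusage (POSIX)
#include <cstdio>

// Returns the peak resident set size reported by getrusage, or -1 on failure.
// Units are platform-specific: kilobytes on Linux, bytes on macOS.
long peak_rss() {
    struct rusage usage;
    if (getrusage(RUSAGE_SELF, &usage) != 0) {
        return -1;
    }
    return usage.ru_maxrss;
}

// Writes a labeled memory reading to stderr; call it periodically or from a
// point just before the code that has been crashing.
void log_memory_usage(const char* label) {
    std::fprintf(stderr, "[mem] %s: peak RSS = %ld\n", label, peak_rss());
}
```

Calling `log_memory_usage("before list copy")` at a few points in a long run would show whether the footprint keeps growing, which helps separate a genuine out-of-memory condition from heap corruption.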
