How to debug a memory issue in Rust?

Question

I hope this question isn't too open-ended. I ran into a memory issue with Rust, where I got an "out of memory" from calling next on an Iterator trait object. I'm unsure how to debug it. Prints have only brought me to the point where the failure occurs. I'm not very familiar with other tools such as ltrace, so although I could create a trace (231 MiB, pff), I didn't really know what to do with it. Is a trace like that useful? Would I do better to grab gdb/lldb? Or Valgrind?

Answer 1

In general, I would take the following approach:

Boilerplate reduction: Try to narrow down the cause of the OOM, so that you don't have too much additional code around. In other words: the quicker your program crashes, the better. Sometimes it is also possible to rip out a specific piece of code and put it into an extra binary, just for the investigation.
Problem size reduction: Turn the problem from an OOM into a simple "too much memory", so that you can tell that some part wastes memory without it actually leading to an OOM. If it is too hard to tell whether you are seeing the issue or not, you can lower the memory limit. On Linux, this can be done using ulimit:
```
ulimit -Sv 500000 # roughly 500 MB (the value is in KiB)
./path/to/exe --foo
```
Information gathering: If your problem is small enough, you are ready to collect information with a lower noise level. There are multiple ways you can try. Just remember to compile your program with debug symbols. It may also help to turn off optimization, since optimization usually leads to information loss. Both can be achieved by NOT using the --release flag during compilation.
- Heap profiling: One way is to use gperftools:
```
LD_PRELOAD="/usr/lib/libtcmalloc.so" HEAPPROFILE=/tmp/profile ./path/to/exe --foo
pprof --gv ./path/to/exe /tmp/profile/profile.0100.heap
```
  This shows you a graph of which parts of your program consume how much memory. See the official docs for more details.
- rr: Sometimes it's very hard to figure out what is actually happening, especially after you created a profile. Assuming you did a good job in step 2, you can use rr:
```
rr record ./path/to/exe --foo
rr replay
```
  This will spawn a GDB with superpowers. The difference from a normal debug session is that you can not only continue but also reverse-continue. Basically, your program is executed from a recording in which you can jump back and forth as you want. This wiki page provides some additional examples. One thing to point out is that rr only seems to work with GDB.
- Good old debugging: Sometimes you get traces and recordings that are still way too large. In that case you can (in combination with the ulimit trick) just use GDB and wait until the program crashes:
```
 gdb --args ./path/to/exe --foo 
```
  You should now get a normal debugging session where you can examine the state of the program at the time of the crash. GDB can also be launched with core dumps. The general problem with this approach is that you cannot go back in time and you cannot continue execution. So you only see the current state, including all stack frames and variables. You could also use LLDB here if you want.
(Potential) fix + repeat: Once you have a clue what might be going wrong, you can try to change your code. Then try again. If it's still not working, go back to step 3 and try again.
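When an external profiler is not available, a rough in-process measurement can complement the information-gathering step. The following is only a minimal sketch and not part of the original answer: it wraps the system allocator in a hypothetical `CountingAlloc` type that tracks live heap bytes, and uses a hypothetical helper `growth_of` to report how much the heap grew while a closure ran. It assumes a single-threaded measurement; concurrent allocations on other threads would distort the number.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical wrapper around the system allocator (an assumption of this
// sketch, not something the answer prescribes).
struct CountingAlloc;

// Number of heap bytes currently allocated through `CountingAlloc`.
static LIVE_BYTES: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = unsafe { System.alloc(layout) };
        if !ptr.is_null() {
            LIVE_BYTES.fetch_add(layout.size(), Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) };
        LIVE_BYTES.fetch_sub(layout.size(), Ordering::Relaxed);
    }
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Returns the number of heap bytes that are currently live.
fn live_bytes() -> usize {
    LIVE_BYTES.load(Ordering::Relaxed)
}

/// Runs `f` and returns its result together with the heap growth it caused.
fn growth_of<T, F: FnOnce() -> T>(f: F) -> (T, usize) {
    let before = live_bytes();
    let value = f();
    let after = live_bytes();
    (value, after.saturating_sub(before))
}

fn main() {
    // Keep the buffer alive so that its allocation is still counted.
    let (buf, grew) = growth_of(|| vec![0u8; 1024 * 1024]);
    println!("holding {} bytes, heap grew by {} bytes", buf.len(), grew);
}
```

Wrapping the suspicious call (for example, the `next` call from the question) in `growth_of` can show whether that call is what makes the heap grow, before reaching for a full heap profiler.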

Answer 2

Valgrind and other tools work fine, and should work out of the box as of Rust 1.32. Earlier versions of Rust require changing the global allocator from jemalloc to the system's allocator so that Valgrind and friends know how to monitor memory allocations.

In this answer, I use the macOS developer tool Instruments, as I'm on macOS, but Valgrind / Massif / Cachegrind work similarly.

Example: An infinite loop

Here's a program that "leaks" memory by pushing 1 MiB Strings into a Vec and never freeing it:

```
use std::{thread, time::Duration};

fn main() {
    let mut held_forever = Vec::new();
    loop {
        held_forever.push("x".repeat(1024 * 1024));
        println!("Allocated another");

        thread::sleep(Duration::from_secs(3));
    }
}
```

In the profiler, you can see memory growth over time, as well as the exact stack trace that allocated the memory.

Example: Cycles in reference counts

Here's an example of leaking memory by creating an infinite reference cycle:

```
use std::{cell::RefCell, rc::Rc};

struct Leaked {
    data: String,
    me: RefCell<Option<Rc<Leaked>>>,
}

fn main() {
    let data = "x".repeat(5 * 1024 * 1024);

    let leaked = Rc::new(Leaked {
        data,
        me: RefCell::new(None),
    });

    let me = leaked.clone();
    *leaked.me.borrow_mut() = Some(me);
}
```
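For comparison, the usual way to avoid this kind of leak is to make the back-reference weak. The sketch below is an illustration rather than part of the original answer: `NotLeaked`, its `dropped` flag, and `weak_cycle_is_freed` are hypothetical names introduced here to show that the value is dropped once the last strong reference goes away.

```rust
use std::cell::{Cell, RefCell};
use std::rc::{Rc, Weak};

// Hypothetical variant of `Leaked` whose self-reference is a `Weak`.
struct NotLeaked {
    data: String,
    me: RefCell<Weak<NotLeaked>>,
    // Shared flag used only to observe that `drop` ran.
    dropped: Rc<Cell<bool>>,
}

impl Drop for NotLeaked {
    fn drop(&mut self) {
        self.dropped.set(true);
    }
}

/// Builds a self-referential value and reports whether it was dropped
/// after its only strong reference went out of scope.
fn weak_cycle_is_freed() -> bool {
    let dropped = Rc::new(Cell::new(false));
    {
        let value = Rc::new(NotLeaked {
            data: "x".repeat(1024),
            me: RefCell::new(Weak::new()),
            dropped: Rc::clone(&dropped),
        });
        *value.me.borrow_mut() = Rc::downgrade(&value);

        // The weak self-reference does not keep the value alive.
        assert_eq!(value.data.len(), 1024);
        assert_eq!(Rc::strong_count(&value), 1);
        assert_eq!(Rc::weak_count(&value), 1);
    }
    dropped.get()
}

fn main() {
    println!("freed: {}", weak_cycle_is_freed());
}
```

With the strong `Rc` from the leaking example, the strong count would never reach zero and `drop` would never run, which is exactly what shows up as retained memory in the profiler.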


Answer 3

In general, to debug, you can use either a log-based approach (either by inserting the logs yourself, or by having a tool such as ltrace, ptrace, ... generate the logs for you) or a debugger.

Note that ltrace, ptrace, or debugger-based approaches require that you be able to reproduce the problem; I tend to favor manual logs because I work in an industry where bug reports are generally too imprecise to allow immediate reproduction (and thus we use logs to create the reproducer scenario).

Rust supports both approaches, and the standard toolset that one uses for C or C++ programs works well for it.

My personal approach is to have some logging in place to quickly narrow down where the issue occurs and, if logging is insufficient, to fire up a debugger for a more fine-grained inspection. In this case, I would recommend going straight for the debugger.

A panic is generated, which means that by breaking on the call to the panic hook, you get to see both the call stack and the memory state at the moment where things go awry.

Launch your program with the debugger, set a breakpoint on the panic hook, run the program, profit.
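The log-based approach can also be combined with the panic hook. The following is a minimal sketch under the assumption that the failure really surfaces as a panic, as described above; `install_logging_hook`, `hook_runs_on_panic`, and `HOOK_RAN` are hypothetical names used only for illustration. The custom hook logs the panic message with a backtrace, and its body is also a convenient place to put a debugger breakpoint.

```rust
use std::backtrace::Backtrace;
use std::panic;
use std::sync::atomic::{AtomicBool, Ordering};

// Records that the hook was invoked (used only for demonstration).
static HOOK_RAN: AtomicBool = AtomicBool::new(false);

/// Installs a panic hook that logs the panic message and a backtrace.
fn install_logging_hook() {
    panic::set_hook(Box::new(|info| {
        HOOK_RAN.store(true, Ordering::SeqCst);
        eprintln!("panic: {info}");
        // `force_capture` collects a backtrace even if RUST_BACKTRACE is unset.
        eprintln!("{}", Backtrace::force_capture());
    }));
}

/// Triggers a simulated panic and reports whether the hook observed it.
fn hook_runs_on_panic() -> bool {
    install_logging_hook();
    let result = panic::catch_unwind(|| {
        panic!("simulated failure");
    });
    result.is_err() && HOOK_RAN.load(Ordering::SeqCst)
}

fn main() {
    println!("hook ran: {}", hook_runs_on_panic());
}
```

The backtrace is only readable if the binary was built with debug symbols, which is the default for non-release builds.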
