Why doesn't the Rust optimizer remove those useless instructions (tested on Godbolt Compiler Explorer)?

Question

I wanted to take a look at the assembly output for a tiny Rust function:

pub fn double(n: u8) -> u8 {
    n + n
}

I used the Godbolt Compiler Explorer to generate and view the assembly (with the -O flag, of course). It shows this output:

example::double:
    push    rbp
    mov     rbp, rsp
    add     dil, dil
    mov     eax, edi
    pop     rbp
    ret

Now I'm a bit confused, as there are a few instructions that don't seem to do anything useful: push rbp, mov rbp, rsp and pop rbp. From what I understand, executing these three instructions alone doesn't have any side effects. So why doesn't the Rust optimizer remove those useless instructions?

For comparison, I also tested a C++ version:

unsigned char doubleN(unsigned char n) {
    return n + n;
}

Assembly output (with -O flag):

doubleN(unsigned char): # @doubleN(unsigned char)
    add dil, dil
    mov eax, edi
    ret

And in fact, here those "useless" instructions from above are missing, as I would expect from an optimized output.

Answer 1

The short answer: Godbolt adds a -C debuginfo=1 flag which forces the optimizer to keep all instructions managing the frame pointer. Rust removes those instructions too when compiling with optimization and without debug information.

What are those instructions doing?

These three instructions are part of the function prologue and epilogue. In particular, here they manage the so-called frame pointer or base pointer (rbp on x86_64). Note: don't confuse the base pointer with the stack pointer (rsp on x86_64)! The base pointer always points inside the current stack frame:

                          ┌──────────────────────┐                         
                          │  function arguments  │                      
                          │         ...          │   
                          ├──────────────────────┤   
                          │    return address    │   
                          ├──────────────────────┤   
              [rbp] ──>   │       last rbp       │   
                          ├──────────────────────┤   
                          │   local variables    │   
                          │         ...          │   
                          └──────────────────────┘

The interesting thing about the base pointer is that it points to a slot on the stack which stores the previous value of rbp. This means that we can easily find the base pointer of the previous stack frame (the frame of the function that called "us").

Even better: all base pointers form something similar to a linked list! We can follow the chain of saved rbp values ("last rbp" in the diagram) to walk up the stack. This means that at each point during program execution, we know exactly which functions called which other functions so that we ended up "here".
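To make the "linked list" idea concrete, here is a minimal sketch in safe Rust that walks such a chain over a simulated stack. Everything in it is an assumption made for illustration: `ToyStack`, `add_frame`, `backtrace`, the made-up addresses, and the convention that the outermost frame stores 0 as its saved rbp. A real unwinder reads the process's actual stack memory instead of a map.

```rust
use std::collections::HashMap;

/// Toy model of stack memory: maps an address to the 8-byte word stored there.
/// Hypothetical type for illustration only.
pub struct ToyStack {
    words: HashMap<u64, u64>,
}

impl ToyStack {
    pub fn new() -> Self {
        ToyStack { words: HashMap::new() }
    }

    /// Record one frame as in the diagram: [rbp] holds the caller's rbp,
    /// and [rbp + 8] holds the return address into the caller.
    pub fn add_frame(&mut self, rbp: u64, saved_rbp: u64, return_addr: u64) {
        self.words.insert(rbp, saved_rbp);
        self.words.insert(rbp.wrapping_add(8), return_addr);
    }

    /// Follow the chain of saved rbp values, collecting one return address
    /// per frame, innermost frame first.
    pub fn backtrace(&self, mut rbp: u64) -> Vec<u64> {
        let mut return_addrs = Vec::new();
        while rbp != 0 {
            let saved_rbp = match self.words.get(&rbp) {
                Some(&word) => word,
                None => break, // unreadable memory: stop walking
            };
            let return_addr = match self.words.get(&rbp.wrapping_add(8)) {
                Some(&word) => word,
                None => break,
            };
            return_addrs.push(return_addr);
            // The stack grows downwards, so a caller's frame must sit at a
            // higher address. Anything else (including the assumed 0
            // terminator) ends the walk and prevents endless loops.
            if saved_rbp <= rbp {
                break;
            }
            rbp = saved_rbp;
        }
        return_addrs
    }
}

/// Three nested calls with made-up addresses: outermost -> a -> b.
pub fn example_backtrace() -> Vec<u64> {
    let mut stack = ToyStack::new();
    stack.add_frame(0x7000, 0, 0x1000); // outermost frame, chain ends here
    stack.add_frame(0x6fe0, 0x7000, 0x2000); // frame of `a`
    stack.add_frame(0x6fc0, 0x6fe0, 0x3000); // frame of `b` (current)
    stack.backtrace(0x6fc0)
}
```

Calling `example_backtrace()` yields the return addresses 0x3000, 0x2000, 0x1000, in that order: the current function's caller first, then its caller, and so on. Mapping those addresses back to function names is a separate step that needs symbol information.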

Let's review the instructions again:

; We store the "old" rbp on the stack
push    rbp

; We update rbp to hold the new value
mov     rbp, rsp

; We undo what we've done: we remove the old rbp
; from the stack and store it in the rbp register
pop     rbp
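The effect of these three instructions can be traced with a tiny register model. This is only a sketch: the `Cpu` struct, its methods and the starting register values are hypothetical, and the model ignores everything except rsp, rbp and stack memory.

```rust
use std::collections::HashMap;

/// Minimal model of the two registers involved plus stack memory.
/// Hypothetical type for illustration only.
#[derive(Debug)]
pub struct Cpu {
    pub rsp: u64,
    pub rbp: u64,
    pub mem: HashMap<u64, u64>,
}

impl Cpu {
    /// push rbp: move the stack pointer down one word, store rbp there.
    pub fn push_rbp(&mut self) {
        self.rsp -= 8;
        self.mem.insert(self.rsp, self.rbp);
    }

    /// mov rbp, rsp: the base pointer now points at the saved old rbp.
    pub fn mov_rbp_rsp(&mut self) {
        self.rbp = self.rsp;
    }

    /// pop rbp: load the saved value back and move the stack pointer up.
    pub fn pop_rbp(&mut self) {
        self.rbp = self.mem[&self.rsp];
        self.rsp += 8;
    }
}

/// Run prologue and epilogue around a body that touches neither register
/// (like `add dil, dil` / `mov eax, edi`) and return the final (rsp, rbp).
pub fn run_prologue_epilogue(rsp: u64, rbp: u64) -> (u64, u64) {
    let mut cpu = Cpu { rsp, rbp, mem: HashMap::new() };
    cpu.push_rbp();
    cpu.mov_rbp_rsp();
    // The function body would execute here; rbp links this frame to its caller.
    cpu.pop_rbp();
    (cpu.rsp, cpu.rbp)
}
```

In this model, `run_prologue_epilogue(0x7ff0, 0x8000)` returns `(0x7ff0, 0x8000)`: both registers end up exactly as they started. That matches the intuition in the question that the instructions have no lasting effect for the function itself. Their value lies in the state they establish while the function body runs.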

What are those instructions good for?

The base pointer and its "linked list" property are hugely important for debugging and for analyzing program behavior in general (e.g. profiling). Without the base pointer, it is much harder to generate a stack trace and to locate the function that is currently executing.

Additionally, managing the frame pointer usually doesn't slow things down much.

Why aren't they removed by the optimizer and how can I enforce it?

They usually would be, if Godbolt didn't pass -C debuginfo=1 to the compiler. This flag instructs the compiler to keep everything related to frame pointer handling, because it is needed for debugging. Note that frame pointers are not inherently required for debugging; other kinds of debug info usually suffice. Frame pointers are kept whenever any kind of debug info is emitted because there are still a few minor issues related to removing frame pointers in Rust programs. This is being discussed in this GitHub tracking issue.

You can "undo" it by adding the flag -C debuginfo=0 yourself. This results in exactly the same output as the C++ version:

example::double:
    add     dil, dil
    mov     eax, edi
    ret

You can also test it locally by executing:

$ rustc -O --crate-type=lib --emit asm -C "llvm-args=-x86-asm-syntax=intel" example.rs

Compiling with optimizations (-O) automatically removes the rbp handling if you don't explicitly turn on debug information.

Answer 1: 18 (accepted), 2017-08-09 09:04:04