
How does Rust know whether to run the destructor during stack unwind?

The documentation for mem::uninitialized points out why it is dangerous/unsafe to use that function: calling drop on uninitialized memory is undefined behavior.

So this code should be, I believe, undefined:

let a: TypeWithDrop = unsafe { mem::uninitialized() };
panic!("=== Testing ==="); // Destructor of `a` will be run (U.B)

However, I wrote this piece of code, which works in safe Rust and does not seem to suffer from undefined behavior:

#![feature(conservative_impl_trait)]

trait T {
    fn disp(&mut self);
}

struct A;
impl T for A {
    fn disp(&mut self) { println!("=== A ==="); }
}
impl Drop for A {
    fn drop(&mut self) { println!("Dropping A"); }
}

struct B;
impl T for B {
    fn disp(&mut self) { println!("=== B ==="); }
}
impl Drop for B {
    fn drop(&mut self) { println!("Dropping B"); }
}

fn foo() -> impl T { return A; }
fn bar() -> impl T { return B; }

fn main() {
    let mut a;
    let mut b;

    let i = 10;
    let t: &mut T = if i % 2 == 0 {
        a = foo();
        &mut a
    } else {
        b = bar();
        &mut b
    };

    t.disp();
    panic!("=== Test ===");
}

It always seems to execute the right destructor while ignoring the other one. If I try to use a or b directly (like a.disp() instead of t.disp()), the compiler correctly errors out, saying I might be using uninitialized memory. What surprised me is that while panicking, it always runs the right destructor (printing the expected string), no matter what the value of i is.

How does this happen? If the runtime can determine which destructor to run, should the part about memory needing to be initialized for types that implement Drop be removed from the documentation of mem::uninitialized() as linked above?

Using drop flags.

Rust (up to and including version 1.12) stores a boolean flag in every value whose type implements Drop (and thus increases that type's size by one byte). That flag decides whether to run the destructor. So when you do b = bar(), it sets the flag for the b variable, and thus only b's destructor runs. Vice versa with a.
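The observable effect described here (only the variable that was actually assigned gets its destructor run while unwinding) can be checked with a small sketch. The names `Noisy`, `Log`, `scenario` and `dropped_during_unwind` are hypothetical helpers introduced for illustration, and the panic is caught with `std::panic::catch_unwind` only so that the result can be inspected afterwards.

```rust
use std::cell::RefCell;
use std::panic::{self, AssertUnwindSafe};
use std::rc::Rc;

/// Shared log recording which destructors ran (hypothetical helper).
type Log = Rc<RefCell<Vec<&'static str>>>;

/// Hypothetical type whose destructor records its name in the log.
struct Noisy {
    name: &'static str,
    log: Log,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

/// Initializes exactly one of two variables, then panics.
fn scenario(i: u32, log: &Log) {
    let a;
    let b;
    let chosen: &Noisy = if i % 2 == 0 {
        a = Noisy { name: "a", log: Rc::clone(log) };
        &a
    } else {
        b = Noisy { name: "b", log: Rc::clone(log) };
        &b
    };
    println!("using {}", chosen.name);
    panic!("=== Test ===");
}

/// Runs the scenario and returns the names dropped during unwinding.
fn dropped_during_unwind(i: u32) -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let result = panic::catch_unwind(AssertUnwindSafe(|| scenario(i, &log)));
    assert!(result.is_err()); // the scenario always panics
    let names = log.borrow().clone();
    names
}

fn main() {
    println!("{:?}", dropped_during_unwind(10)); // only "a" was initialized
    println!("{:?}", dropped_during_unwind(11)); // only "b" was initialized
}
```

The never-assigned variable is simply skipped during unwinding, which is the behavior the question observed.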

Note that starting from Rust version 1.13 (the beta compiler at the time of this writing), that flag is not stored in the type, but on the stack for every variable or temporary. This is made possible by the advent of the MIR in the Rust compiler. The MIR significantly simplifies the translation of Rust code to machine code, and thus made it feasible to move drop flags to the stack. Optimizations will usually eliminate that flag if they can figure out at compile time which object will be dropped when.
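One way to picture a stack-based drop flag is to write it out by hand. The sketch below is not what the compiler emits; it is an analogy under the assumption that a flag is simply a `bool` sitting next to the storage. `Flagged`, `Counted` and `drops_at_scope_end` are hypothetical names introduced for illustration.

```rust
use std::cell::Cell;
use std::mem::MaybeUninit;

/// Hypothetical type that counts destructor calls in a shared counter.
struct Counted<'a>(&'a Cell<u32>);

impl Drop for Counted<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Hand-written stand-in for "a variable plus its drop flag".
struct Flagged<T> {
    slot: MaybeUninit<T>,
    initialized: bool, // plays the role of the drop flag
}

impl<T> Flagged<T> {
    fn new() -> Self {
        Flagged { slot: MaybeUninit::uninit(), initialized: false }
    }

    /// Stores a value and sets the flag, dropping any previous value first.
    fn init(&mut self, value: T) {
        self.release();
        self.slot.write(value);
        self.initialized = true;
    }

    /// Runs the destructor only if the flag says a value is present.
    fn release(&mut self) {
        if self.initialized {
            self.initialized = false;
            // SAFETY: the flag guarantees the slot holds a valid `T`.
            unsafe { self.slot.assume_init_drop() };
        }
    }
}

impl<T> Drop for Flagged<T> {
    fn drop(&mut self) {
        self.release();
    }
}

/// Returns how many destructors ran once the slot went out of scope.
fn drops_at_scope_end(initialize: bool) -> u32 {
    let counter = Cell::new(0);
    {
        let mut slot = Flagged::new();
        if initialize {
            slot.init(Counted(&counter));
        }
    } // `slot` leaves scope here; its flag decides whether to drop
    counter.get()
}

fn main() {
    println!("{}", drops_at_scope_end(true));
    println!("{}", drops_at_scope_end(false));
}
```

In real code the compiler maintains such a flag itself, and only for variables whose initialization state cannot be determined statically.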

You can "observe" this flag in a Rust compiler up to version 1.12 by looking at the size of the type:

struct A;

struct B;

impl Drop for B {
    fn drop(&mut self) {}
}

fn main() {
    println!("{}", std::mem::size_of::<A>());
    println!("{}", std::mem::size_of::<B>());
}

This prints 0 and 1 respectively before stack flags, and 0 and 0 with stack flags.

Using mem::uninitialized is still unsafe, however, because the compiler still sees the assignment to the a variable and sets the drop flag. Thus the destructor will be called on uninitialized memory. Note that in your example the Drop impl does not access any memory of your type (except for the drop flag, but that is invisible to you). Therefore you are not accessing the uninitialized memory (which is zero bytes in size anyway, since your type is a zero-sized struct). To the best of my knowledge, that means that your unsafe { std::mem::uninitialized() } code is actually safe, because no memory unsafety can occur afterwards.

There are two questions hidden here:

  1. How does the compiler track which variable is initialized or not?
  2. Why may initializing with mem::uninitialized() lead to Undefined Behavior?

Let's tackle them in order.


How does the compiler track which variable is initialized or not?

The compiler injects so-called "drop flags": for each variable for which Drop must run at the end of the scope, a boolean flag is injected on the stack, stating whether this variable needs to be disposed of.

The flag starts off as "no", moves to "yes" if the variable is initialized, and back to "no" if the variable is moved from.

Finally, when the time comes to drop this variable, the flag is checked and the variable is dropped if necessary.
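The "yes" to "no" transition on a move can be observed from safe code. In this minimal sketch, `Counted` and `drops_before_and_after` are hypothetical names introduced for illustration. The value is moved out only on one path, so whether it still needs dropping at the end of the scope is known only at run time.

```rust
use std::cell::Cell;

/// Hypothetical type that counts destructor calls in a shared counter.
struct Counted<'a>(&'a Cell<u32>);

impl Drop for Counted<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Returns (drops seen just before the scope ends, drops seen after it).
fn drops_before_and_after(move_out: bool) -> (u32, u32) {
    let counter = Cell::new(0);
    let before;
    {
        let value = Counted(&counter); // initialized: flag is "yes"
        if move_out {
            // Moved into `drop`: the flag goes back to "no",
            // and the destructor runs inside `drop` right away.
            drop(value);
        }
        before = counter.get();
    } // scope end: the destructor runs here only if the flag is still "yes"
    (before, counter.get())
}

fn main() {
    println!("{:?}", drops_before_and_after(true));
    println!("{:?}", drops_before_and_after(false));
}
```

Either way the destructor runs exactly once; the flag only decides where.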

This is independent of whether the compiler's flow analysis complains about potentially uninitialized variables: code is only generated once the flow analysis is satisfied.


Why may initializing with mem::uninitialized() lead to Undefined Behavior?

When using mem::uninitialized() you make a promise to the compiler: don't worry, I'm definitely initializing this.

As far as the compiler is concerned, the variable is therefore fully initialized, and the drop flag is set to "yes" (until you move out of it).

This, in turn, means that Drop will be called.

Using an uninitialized object is Undefined Behavior, and the compiler calling Drop on an uninitialized object on your behalf counts as "using it".
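For readers on a current toolchain: mem::uninitialized has since been deprecated in favor of std::mem::MaybeUninit, which avoids this particular trap because a MaybeUninit<T> never runs T's destructor on its own. The sketch below illustrates that difference; `Counted`, `drops_without_init` and `drops_with_init` are hypothetical names introduced for illustration, not part of the original answer.

```rust
use std::cell::Cell;
use std::mem::MaybeUninit;

/// Hypothetical type that counts destructor calls in a shared counter.
struct Counted<'a>(&'a Cell<u32>);

impl Drop for Counted<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// The slot is never initialized, so no destructor may run on it.
fn drops_without_init() -> u32 {
    let counter = Cell::new(0);
    {
        let _slot: MaybeUninit<Counted<'_>> = MaybeUninit::uninit();
        // Leaving scope drops the `MaybeUninit` wrapper only,
        // never the (nonexistent) `Counted` inside it.
    }
    counter.get()
}

/// The slot is initialized and then explicitly turned into a real value.
fn drops_with_init() -> u32 {
    let counter = Cell::new(0);
    {
        let mut slot = MaybeUninit::uninit();
        slot.write(Counted(&counter));
        // SAFETY: the slot was written on the line above.
        let _value: Counted<'_> = unsafe { slot.assume_init() };
    } // `_value` is an ordinary `Counted` and is dropped here
    counter.get()
}

fn main() {
    println!("{}", drops_without_init());
    println!("{}", drops_with_init());
}
```

The promise to the compiler is now made explicitly at `assume_init`, after the memory really has been initialized.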


Bonus:

In my tests, nothing weird happened!

Note that Undefined Behavior means that anything can happen; anything, unfortunately, also includes "seems to work" (or even "works as intended despite the odds").

In particular, if you do NOT access the object's memory in Drop::drop (just printing), then it's very likely that everything will just work. If you do access it, however, you might see weird integers, pointers pointing into the wild, etc.

And if the optimizer is clever, even without accessing it, it might do weird things! Since we are using LLVM, I invite you to read "What Every C Programmer Should Know About Undefined Behavior" by Chris Lattner (LLVM's father).

First, there are drop flags - runtime information for tracking which variables have been initialized. If a variable was not assigned to, drop() will not be executed for it.

In stable, the drop flag is currently stored within the type itself. Writing uninitialized memory to it can cause undefined behavior as to whether drop() will or will not be called. This information will soon be out of date, because the drop flag has been moved out of the type itself in nightly.

In nightly Rust, if you assign uninitialized memory to a variable, it is safe to assume that drop() will be executed. However, any useful implementation of drop() will operate on the value. There is no way to detect within the Drop trait implementation whether the value is properly initialized: it could result in trying to free an invalid pointer or any other random thing, depending on the Drop implementation of the type. Assigning uninitialized memory to a type with Drop is ill-advised anyway.
