Why are explicit lifetimes needed in Rust?

I was reading the lifetimes chapter of the Rust book, and I came across this example for a named/explicit lifetime:

struct Foo<'a> {
    x: &'a i32,
}

fn main() {
    let x;                    // -+ x goes into scope
                              //  |
    {                         //  |
        let y = &5;           // ---+ y goes into scope
        let f = Foo { x: y }; // ---+ f goes into scope
        x = &f.x;             //  | | error here
    }                         // ---+ f and y go out of scope
                              //  |
    println!("{}", x);        //  |
}                             // -+ x goes out of scope

It's quite clear to me that the error being prevented by the compiler is the use-after-free of the reference assigned to x: after the inner scope is done, f and therefore &f.x become invalid, and should not have been assigned to x.

My issue is that the problem could have easily been analyzed away without using the explicit 'a lifetime, for instance by inferring an illegal assignment of a reference to a wider scope (x = &f.x;).

In which cases are explicit lifetimes actually needed to prevent use-after-free (or some other class?) errors?

The other answers all have salient points (fjh's concrete example where an explicit lifetime is needed), but are missing one key thing: why are explicit lifetimes needed when the compiler will tell you you've got them wrong?

This is actually the same question as "why are explicit types needed when the compiler can infer them". A hypothetical example:

fn foo() -> _ {  
    ""
}

Of course, the compiler can see that I'm returning a &'static str, so why does the programmer have to type it?

The main reason is that while the compiler can see what your code does, it doesn't know what your intent was.

Functions are a natural boundary to firewall the effects of changing code. If we were to allow lifetimes to be completely inferred from the code, then an innocent-looking change might affect the lifetimes, which could then cause errors in a function far away. This isn't a hypothetical example. As I understand it, Haskell has this problem when you rely on type inference for top-level functions. Rust nipped that particular problem in the bud.

There is also an efficiency benefit to the compiler: only function signatures need to be parsed in order to verify types and lifetimes. More importantly, it has an efficiency benefit for the programmer. If we didn't have explicit lifetimes, what would this function do:

fn foo(a: &u8, b: &u8) -> &u8

It's impossible to tell without inspecting the source, which would go against a huge number of coding best practices.
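As a rough sketch of why the signature alone should answer that question, here are three functions that all fit the shape foo(a: &u8, b: &u8) -> &u8 but promise different things to the caller. The names first, either and neither are hypothetical, made up only for this illustration.

```rust
// Hypothetical functions: same argument types, three different lifetime contracts.

// The result borrows only from `a`; `b` is free to die right after the call.
fn first<'a, 'b>(a: &'a u8, _b: &'b u8) -> &'a u8 {
    a
}

// The result may borrow from either argument, so both must outlive it.
fn either<'a>(a: &'a u8, b: &'a u8) -> &'a u8 {
    if a >= b { a } else { b }
}

// The result borrows from neither argument.
fn neither(_a: &u8, _b: &u8) -> &'static u8 {
    &0
}

fn main() {
    let (x, y) = (3u8, 7u8);
    // prints "3 7 0"
    println!("{} {} {}", first(&x, &y), either(&x, &y), neither(&x, &y));
}
```

A caller can tell which arguments must stay alive just by reading each signature, without opening the body.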

"by inferring an illegal assignment of a reference to a wider scope"

Scopes are lifetimes, essentially. A bit more clearly, a lifetime 'a is a generic lifetime parameter that can be specialized with a specific scope at compile time, based on the call site.
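A small sketch of that specialization, assuming a hypothetical helper named pick: the same generic 'a is filled in with a different scope at each call site.

```rust
// Hypothetical helper: 'a is a generic parameter, chosen anew at each call site.
fn pick<'a>(r: &'a i32) -> &'a i32 {
    r
}

fn main() {
    let outer = 1;
    let a = pick(&outer); // here 'a can extend to the end of main
    {
        let inner = 2;
        let b = pick(&inner); // here 'a cannot extend past this inner block
        println!("{}", b);
    }
    println!("{}", a);
}
```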

"are explicit lifetimes actually needed to prevent [...] errors?"

Not at all. Lifetimes are needed to prevent errors, but explicit lifetimes are needed to protect what little sanity programmers have.

Let's have a look at the following example.

fn foo<'a, 'b>(x: &'a u32, y: &'b u32) -> &'a u32 {
    x
}

fn main() {
    let x = 12;
    let z: &u32 = {
        let y = 42;
        foo(&x, &y)
    };
}
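The example above compiles because the signature of foo ties the result to 'a, the lifetime of its first argument, so the result may outlive y. As a sketch of the contrast, here is a hypothetical variant, foo_same, in which both arguments share one lifetime; the call that the compiler would reject is left commented out.

```rust
fn foo<'a, 'b>(x: &'a u32, _y: &'b u32) -> &'a u32 {
    x
}

// Hypothetical variant: one lifetime for both arguments and the result.
fn foo_same<'a>(x: &'a u32, _y: &'a u32) -> &'a u32 {
    x
}

fn main() {
    let x = 12;
    let z: &u32 = {
        let y = 42;
        // foo_same(&x, &y) // rejected: `y` does not live long enough
        foo(&x, &y) // accepted: the result borrows only from `x`
    };
    println!("{}", z);

    // Within a single scope the stricter signature is fine.
    let w = 7;
    println!("{}", foo_same(&x, &w));
}
```

The two bodies are identical; only the annotations decide which callers are accepted.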

The lifetime annotation in the following structure:

struct Foo<'a> {
    x: &'a i32,
}

specifies that a Foo instance must not outlive the reference it holds in its x field. Note that there are no explicit lifetimes in the question's code, except the structure definition. The compiler is perfectly able to infer lifetimes in main().

In type definitions, however, explicit lifetimes are unavoidable. For example, there is an ambiguity here:

struct RefPair(&u32, &u32);

Should these be different lifetimes or should they be the same? It does matter from the usage perspective: struct RefPair<'a, 'b>(&'a u32, &'b u32) is very different from struct RefPair<'a>(&'a u32, &'a u32).
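To illustrate the difference, here is a minimal sketch using the two-lifetime definition. The first method is a hypothetical addition, not part of the definition above.

```rust
struct RefPair<'a, 'b>(&'a u32, &'b u32);

impl<'a, 'b> RefPair<'a, 'b> {
    // Hypothetical accessor: the result is tied to 'a only, not to 'b.
    fn first(&self) -> &'a u32 {
        self.0
    }
}

fn main() {
    let long = 1;
    let kept: &u32 = {
        let short = 2;
        let pair = RefPair(&long, &short);
        pair.first() // may outlive `short`, because 'a and 'b are independent
    };
    println!("{}", kept);
}
```

With the single-lifetime definition, RefPair<'a>(&'a u32, &'a u32), the same main would be rejected: 'a would have to fit inside the scope of short, so the first reference could not escape the inner block.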

Now, for simple cases, like the one you provided, the compiler could theoretically elide lifetimes like it does in other places, but such cases are very limited and are not worth the extra complexity in the compiler, and the gain in clarity would be questionable at the very least.

If a function receives two references as arguments and returns a reference, then the implementation of the function might sometimes return the first reference and sometimes the second one. It is impossible to predict which reference will be returned for a given call. In this case, it is impossible to infer a lifetime for the returned reference, since each argument reference may refer to a different variable binding with a different lifetime. Explicit lifetimes help to avoid or clarify such a situation.

Likewise, if a structure holds two references (as two member fields), then a member function of the structure may sometimes return the first reference and sometimes the second one. Again, explicit lifetimes prevent such ambiguities.

In a few simple situations, lifetime elision applies, and the compiler can infer the lifetimes.
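As a sketch of one such situation: when a function takes exactly one reference argument, the returned reference is tied to it, so the two signatures below mean the same thing. The function names are hypothetical.

```rust
// Elided form: with a single reference argument, the result is tied to it.
fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

// The same signature with the lifetime written out in full.
fn first_word_explicit<'a>(s: &'a str) -> &'a str {
    s.split_whitespace().next().unwrap_or("")
}

fn main() {
    let text = String::from("explicit lifetimes");
    println!("{}", first_word(&text));
    println!("{}", first_word_explicit(&text));
}
```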

The case from the book is very simple by design. The topic of lifetimes is deemed complex.

The compiler cannot easily infer the lifetime in a function with multiple arguments.

Also, my own optional crate has an OptionBool type with an as_slice method whose signature actually is:

fn as_slice(&self) -> &'static [bool] { ... }
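To show what such a signature allows, here is a minimal sketch with a hypothetical MaybeBool enum. It is not the code of the optional crate; it only imitates the shape of the signature above. Elision would tie the result to &self, so only the explicit 'static tells the caller that the slice can outlive the value.

```rust
// Hypothetical stand-in type; not the real OptionBool from the optional crate.
enum MaybeBool {
    Yes,
    No,
    Unknown,
}

impl MaybeBool {
    // Every arm returns a constant slice, so the result does not borrow from self.
    fn as_slice(&self) -> &'static [bool] {
        match self {
            MaybeBool::Yes => &[true],
            MaybeBool::No => &[false],
            MaybeBool::Unknown => &[],
        }
    }
}

fn main() {
    let slice: &'static [bool] = {
        let value = MaybeBool::Yes;
        value.as_slice() // `value` dies at the end of this block, the slice does not
    };
    println!("{:?}", slice);
    println!("{:?} {:?}", MaybeBool::No.as_slice(), MaybeBool::Unknown.as_slice());
}
```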

The reason why your example does not work is simply because Rust only has local lifetime and type inference. What you are suggesting demands global inference. Whenever you have a reference whose lifetime cannot be elided, it must be annotated.

As a newcomer to Rust, my understanding is that explicit lifetimes serve two purposes.

  1. Putting an explicit lifetime annotation on a function restricts the type of code that may appear inside that function. Explicit lifetimes allow the compiler to ensure that your program is doing what you intended.

  2. If you (the compiler) want(s) to check if a piece of code is valid, you (the compiler) will not have to iteratively look inside every function called. It suffices to have a look at the annotations of functions that are directly called by that piece of code. This makes your program much easier to reason about for you (the compiler), and makes compile times manageable.

On point 1, consider the following program written in Python:

import pandas as pd
import numpy as np

def second_row(ar):
    return ar[0]

def work(second):
    df = pd.DataFrame(data=second)
    df.loc[0, 0] = 1

def main():
    # .. load data ..
    ar = np.array([[0, 0], [0, 0]])

    # .. do some work on second row ..
    second = second_row(ar)
    work(second)

    # .. much later ..
    print(repr(ar))

if __name__=="__main__":
    main()

which will print

array([[1, 0],
       [0, 0]])

This type of behaviour always surprises me. What is happening is that df is sharing memory with ar, so when some of the content of df changes in work, that change infects ar as well. However, in some cases this may be exactly what you want, for memory efficiency reasons (no copy). The real problem in this code is that the function second_row is returning the first row instead of the second; good luck debugging that.

Consider instead a similar program written in Rust:

#[derive(Debug)]
struct Array<'a, 'b>(&'a mut [i32], &'b mut [i32]);

impl<'a, 'b> Array<'a, 'b> {
    fn second_row(&mut self) -> &mut &'b mut [i32] {
        &mut self.0
    }
}

fn work(second: &mut [i32]) {
    second[0] = 1;
}

fn main() {
    // .. load data ..
    let ar1 = &mut [0, 0][..];
    let ar2 = &mut [0, 0][..];
    let mut ar = Array(ar1, ar2);

    // .. do some work on second row ..
    {
        let second = ar.second_row();
        work(second);
    }

    // .. much later ..
    println!("{:?}", ar);
}

Compiling this, you get

error[E0308]: mismatched types
 --> src/main.rs:6:13
  |
6 |             &mut self.0
  |             ^^^^^^^^^^^ lifetime mismatch
  |
  = note: expected type `&mut &'b mut [i32]`
             found type `&mut &'a mut [i32]`
note: the lifetime 'b as defined on the impl at 4:5...
 --> src/main.rs:4:5
  |
4 |     impl<'a, 'b> Array<'a, 'b> {
  |     ^^^^^^^^^^^^^^^^^^^^^^^^^^
note: ...does not necessarily outlive the lifetime 'a as defined on the impl at 4:5
 --> src/main.rs:4:5
  |
4 |     impl<'a, 'b> Array<'a, 'b> {
  |     ^^^^^^^^^^^^^^^^^^^^^^^^^^

In fact you get two errors; there is also one with the roles of 'a and 'b interchanged. Looking at the annotation of second_row, we find that the output should be &mut &'b mut [i32], i.e., the output is supposed to be a reference to a reference with lifetime 'b (the lifetime of the second row of Array). However, because we are returning the first row (which has lifetime 'a), the compiler complains about lifetime mismatch. At the right place. At the right time. Debugging is a breeze.
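For completeness, here is a sketch of the program once the body agrees with the annotation, that is, with second_row returning field 1. The rows are created as local arrays here, which is a small simplification of the original main.

```rust
#[derive(Debug)]
struct Array<'a, 'b>(&'a mut [i32], &'b mut [i32]);

impl<'a, 'b> Array<'a, 'b> {
    // The body now matches the signature: field 1 is the row with lifetime 'b.
    fn second_row(&mut self) -> &mut &'b mut [i32] {
        &mut self.1
    }
}

fn work(second: &mut [i32]) {
    second[0] = 1;
}

fn main() {
    // .. load data ..
    let mut row1 = [0, 0];
    let mut row2 = [0, 0];
    let mut ar = Array(&mut row1, &mut row2);

    // .. do some work on second row ..
    work(ar.second_row());

    // .. much later ..
    println!("{:?}", ar); // prints "Array([0, 0], [1, 0])"
}
```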

I think of a lifetime annotation as a contract about a given ref being valid in the receiving scope only while it remains valid in the source scope. Declaring more references with the same lifetime kind of merges the scopes, meaning that all the source refs have to satisfy this contract. Such annotations allow the compiler to check that the contract is fulfilled.

Simplified Answer:

Example:

fn main() {
    let string1 = String::from("abcd");
    let string2 = "xyz";

    let result = longest(string1.as_str(), string2);
    println!("The longest string is {}", result);
}

fn longest(x: &str, y: &str) -> &str {  // <-- ERROR
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

Think of the function call as the customer and the function as the seller.

In the snippet above, the customer expects to get back one of the values it passed in as arguments. But suppose the seller is biased, or accidentally returns a value other than its parameters, like this:

fn longest(x: &str, y: &str) -> &str {
    let z = "Other String";
    &z
}

Then both the function call and the function would get an error message, even though the customer has made no mistake. Therefore Rust relies on the annotated lifetime parameter to ensure that the customer never gets an error for the seller's mistake.
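A minimal sketch of the fix for the first snippet, using the usual approach of one shared lifetime parameter for both arguments and the result:

```rust
// 'a says: the result is valid only as long as both `x` and `y` are.
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

fn main() {
    let string1 = String::from("abcd");
    let string2 = "xyz";

    let result = longest(string1.as_str(), string2);
    println!("The longest string is {}", result); // prints "The longest string is abcd"
}
```

With this signature in place, a body that tried to return a reference to one of its own local variables would be rejected inside the function, at the seller, and the call site would not be affected.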

This is also the reason why TypeScript was introduced on top of JavaScript.

