Is my understanding of a Rust vector that supports Rc or Box wrapped types correct?

I'm not looking for code samples. I want to state my understanding of Box vs. Rc and have you tell me if my understanding is right or wrong.

Let's say I have some trait ChattyAnimal and a struct Cat that implements this trait, e.g.

pub trait ChattyAnimal {
  fn make_sound(&self);
}

pub struct Cat {
  pub name: String,
  pub sound: String
}

impl ChattyAnimal for Cat {
  fn make_sound(&self) {
    println!("Meow!");
  }
}

Now let's say I have other structs (Dog, Cow, Chicken, ...) that also implement the ChattyAnimal trait, and let's say I want to store all of these in the same vector.

So step 1 is that I would have to use a Box type, because the Rust compiler cannot determine the size of everything that might implement this trait. Therefore, we must store these items on the heap, and voilà: we use a Box type, which is like a smart pointer in C++. Anything wrapped in a Box is automatically deleted by Rust when it goes out of scope.

// I can alias and use my Box type that wraps the trait like this:
pub type BoxyChattyAnimal = Box<dyn ChattyAnimal>;

// and then I can use my type alias, i.e.
pub struct Container {
  animals: Vec<BoxyChattyAnimal>
}
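To make the sizing point concrete, here is a minimal sketch of such a vector holding two different implementors. The Dog struct and the sound and all_sounds names are assumptions made for this illustration (the trait in the question prints instead of returning a String), so treat it as one possible shape rather than the asker's actual code.

```rust
use std::mem::size_of;

pub trait ChattyAnimal {
    // Assumption: returns the sound instead of printing it, so it is easy to check.
    fn sound(&self) -> String;
}

pub struct Cat;
pub struct Dog; // hypothetical second implementor

impl ChattyAnimal for Cat {
    fn sound(&self) -> String {
        "Meow!".to_string()
    }
}

impl ChattyAnimal for Dog {
    fn sound(&self) -> String {
        "Woof!".to_string()
    }
}

pub type BoxyChattyAnimal = Box<dyn ChattyAnimal>;

/// Calls the trait method on every element through dynamic dispatch.
pub fn all_sounds(animals: &[BoxyChattyAnimal]) -> Vec<String> {
    animals.iter().map(|a| a.sound()).collect()
}

fn main() {
    // Different concrete types, one element type: Box<dyn ChattyAnimal>.
    let animals: Vec<BoxyChattyAnimal> = vec![Box::new(Cat), Box::new(Dog)];
    for s in all_sounds(&animals) {
        println!("{s}");
    }
    // Every element is a fat pointer (data pointer + vtable pointer),
    // so the vector's element size is known even though the animals differ.
    assert_eq!(size_of::<BoxyChattyAnimal>(), 2 * size_of::<usize>());
}
```

The vector itself only stores fixed-size pointers; the differently sized animals live behind them on the heap.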

Meanwhile, with Box, Rust's borrow checker requires changing ownership when I pass or reassign the instance. But if I actually want to have multiple references to the same underlying instance, I have to use Rc. And so to have a vector of ChattyAnimal instances where each instance can have multiple references, I would need to do:

use std::rc::Rc;

pub type RcChattyAnimal = Rc<dyn ChattyAnimal>;

pub struct Container {
  animals: Vec<RcChattyAnimal>
}
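As a rough illustration of "multiple references to the same underlying instance", the sketch below places one animal in two vectors. The share function and the house and garden names are hypothetical, and sound is an assumed variant of make_sound that returns a String.

```rust
use std::rc::Rc;

pub trait ChattyAnimal {
    // Assumption: returns the sound instead of printing it.
    fn sound(&self) -> String;
}

pub struct Cat {
    pub name: String,
}

impl ChattyAnimal for Cat {
    fn sound(&self) -> String {
        format!("{} says Meow!", self.name)
    }
}

pub type RcChattyAnimal = Rc<dyn ChattyAnimal>;

/// Hypothetical helper: puts the same animal into two separate vectors.
pub fn share(animal: RcChattyAnimal) -> (Vec<RcChattyAnimal>, Vec<RcChattyAnimal>) {
    let house = vec![Rc::clone(&animal)]; // second handle to the same allocation
    let garden = vec![animal]; // the original handle is moved in here
    (house, garden)
}

fn main() {
    let cat: RcChattyAnimal = Rc::new(Cat { name: "Tom".to_string() });
    let (house, garden) = share(cat);
    // Both vectors point at one Cat; nothing was copied.
    assert!(Rc::ptr_eq(&house[0], &garden[0]));
    println!("{}", house[0].sound());
}
```

With Box, the same attempt would not compile, because moving the Box into one vector leaves nothing to put in the other.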

One important takeaway from this is that if I want to have a vector of some trait type, I need to explicitly set that vector's element type to a Box or Rc that wraps my trait. And so the Rust language designers force us to think about this in advance, so that a Box and an Rc cannot (at least not easily or accidentally) end up in the same vector.

This feels like a very well thought-out design that helps prevent me from introducing bugs in my code. Is my understanding as stated above correct?

Yes, all this is correct.

There's a second reason for this design: it allows the compiler to verify that the operations you're performing on the vector elements are using memory in a safe way, relative to how they're stored.

For example, if you had a method on ChattyAnimal that mutates the animal (i.e. takes a &mut self argument), you could call that method on elements of a Vec<Box<dyn ChattyAnimal>> as long as you had a mutable reference to the vector; the Rust compiler would know that there could only be one reference to the ChattyAnimal in question (because the only reference is inside the Box, which is inside the Vec, and you have a mutable reference to the Vec so there can't be any other references to it). If you tried to write the same code with a Vec<Rc<dyn ChattyAnimal>>, the compiler would complain; it wouldn't be able to completely eliminate the possibility that your code might be mutating the animal at the same time as the code that called it was in the middle of trying to read the animal, which might lead to some inconsistencies in the calling code.
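A small sketch of that difference follows. The rename method and both helper functions are hypothetical additions for illustration. RefCell is not part of the answer above; it is shown here only as one common way to get mutation behind an Rc, with the borrow check moved to run time.

```rust
use std::cell::RefCell;
use std::rc::Rc;

pub trait ChattyAnimal {
    fn sound(&self) -> String;
    // Hypothetical mutating method, added for this sketch.
    fn rename(&mut self, name: &str);
}

pub struct Cat {
    pub name: String,
}

impl ChattyAnimal for Cat {
    fn sound(&self) -> String {
        format!("{} says Meow!", self.name)
    }
    fn rename(&mut self, name: &str) {
        self.name = name.to_string();
    }
}

/// Box: a mutable borrow of the vector is enough to mutate each element.
pub fn rename_boxed(animals: &mut [Box<dyn ChattyAnimal>], name: &str) {
    for a in animals.iter_mut() {
        a.rename(name);
    }
}

/// Rc: mutation goes through RefCell, which checks borrows at run time.
pub fn rename_shared(animals: &[Rc<RefCell<dyn ChattyAnimal>>], name: &str) {
    for a in animals {
        a.borrow_mut().rename(name);
    }
}

fn main() {
    let mut boxed: Vec<Box<dyn ChattyAnimal>> = vec![Box::new(Cat { name: "Tom".into() })];
    rename_boxed(&mut boxed, "Felix");
    println!("{}", boxed[0].sound());

    // With a plain Vec<Rc<dyn ChattyAnimal>>, calling `animals[0].rename("Felix")`
    // is rejected at compile time, because Rc only hands out shared references.
    let cat: Rc<RefCell<dyn ChattyAnimal>> = Rc::new(RefCell::new(Cat { name: "Tom".into() }));
    let shared = vec![cat];
    rename_shared(&shared, "Felix");
    println!("{}", shared[0].borrow().sound());
}
```

In the Rc version, borrow_mut panics if the animal is already borrowed elsewhere, which is the run-time counterpart of the conflict the compiler rules out statically for Box.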

As a consequence, the compiler needs to know that all the elements of the Vec have their memory treated in the same way, so that it can check to make sure that a reference to some arbitrary element of the Vec is being used appropriately.

(There's a third reason, too, which is performance; because the compiler knows that this is a "vector of Boxes" or "vector of Rcs", it can generate code that assumes a particular storage mechanism. For example, if you have a vector of Rcs, and clone one of the elements, the machine code that the compiler generates will work simply by going to the memory address listed in the vector and adding 1 to the reference count stored there; there's no need for any extra levels of indirection. If the vector were allowed to mix different allocation schemes, the generated code would have to be a lot more complex, because it wouldn't be able to assume things like "there is a reference count", and would instead need to (at runtime) find the appropriate piece of code for dealing with the memory allocation scheme in use, and then run it; that would be much slower.)
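The reference-count behaviour described here can be observed from safe code. The sketch below is only an illustration: counts_around_clone is a hypothetical helper, and it shows the count changing rather than the machine code the compiler emits.

```rust
use std::rc::Rc;

pub trait ChattyAnimal {
    fn sound(&self) -> String;
}

pub struct Cat;

impl ChattyAnimal for Cat {
    fn sound(&self) -> String {
        "Meow!".to_string()
    }
}

/// Hypothetical helper: reports the strong count of the first element
/// before a clone, while the clone is alive, and after the clone is dropped.
pub fn counts_around_clone(animals: &[Rc<dyn ChattyAnimal>]) -> (usize, usize, usize) {
    let before = Rc::strong_count(&animals[0]);
    let extra = Rc::clone(&animals[0]); // bumps the count stored in the allocation
    let during = Rc::strong_count(&animals[0]);
    drop(extra); // decrements it again
    let after = Rc::strong_count(&animals[0]);
    (before, during, after)
}

fn main() {
    let cat: Rc<dyn ChattyAnimal> = Rc::new(Cat);
    let animals = vec![cat];
    let (before, during, after) = counts_around_clone(&animals);
    println!("{before} -> {during} -> {after}"); // prints "1 -> 2 -> 1"
}
```

Cloning the Rc never touches the Cat itself, only the counter that sits next to it on the heap.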
