Rust: how to have multiple mutable references to a stack-allocated object?

Let us say we have this C code:

typedef struct A { int i; } A;
typedef struct B { A* a; } B;
typedef struct C { A* a; } C;

int main(void)
{
  A a = { .i = 42 };
  B b = { .a = &a };
  C c = { .a = &a };
}

In this scenario A is stack allocated, and B and C both point to the stack-allocated memory where A lives.

I need to do exactly the same thing in Rust, but every time I try to create multiple mutable references the compiler complains.

It's a little frustrating having to fight the language to accomplish something so basic.

This may not apply to your specific situation as indicated in the comments, but the general way to solve this in Rust is with a RefCell. This type allows you to obtain a &mut T (through a RefMut<T> guard) from a &RefCell<T>. For example:

use std::cell::RefCell;

struct A(pub RefCell<i32>);
struct B<'a>(pub &'a RefCell<i32>);

fn main() {
    let a = A(RefCell::new(0));
    let b = B(&a.0);
    let c = B(&a.0);
    
    *b.0.borrow_mut() = 1;
    println!("{}", c.0.borrow());
    
    *c.0.borrow_mut() = 2;
    println!("{}", b.0.borrow());
}

Note that RefCell has overhead in the form of a borrow count, which is required to enforce Rust's aliasing rules at runtime (it will panic if you request a mutable borrow while any other borrow is still alive, or any borrow while a mutable borrow is still alive).
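To make the runtime check concrete, here is a minimal sketch. The helper name second_borrow_is_refused is made up for illustration; it uses try_borrow_mut, the non-panicking counterpart of borrow_mut, to observe the conflict instead of crashing:

```rust
use std::cell::RefCell;

/// Hypothetical helper: returns true if a second mutable borrow is
/// refused while the first one is still alive.
fn second_borrow_is_refused(cell: &RefCell<i32>) -> bool {
    let _first = cell.borrow_mut(); // first mutable borrow, kept alive
    // `borrow_mut()` would panic here; `try_borrow_mut()` returns Err instead.
    cell.try_borrow_mut().is_err()
}

fn main() {
    let a = RefCell::new(0);

    // Overlapping mutable borrows are detected at runtime.
    assert!(second_borrow_is_refused(&a));

    // The guard inside the helper has been dropped, so borrowing works again.
    *a.borrow_mut() += 1;
    println!("{}", a.borrow()); // prints 1
}
```

In practice this means keeping the guards returned by borrow() and borrow_mut() as short-lived as possible, as the example above does by borrowing within a single statement.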

If the underlying type is Copy and you don't need references to the inner value, then you can use Cell, which does not have any runtime overhead, as each operation fully retrieves or replaces the contained value:

use std::cell::Cell;

struct A(pub Cell<i32>);
struct B<'a>(pub &'a Cell<i32>);

fn main() {
    let a = A(Cell::new(0));
    let b = B(&a.0);
    let c = B(&a.0);
    
    b.0.set(1);
    println!("{}", c.0.get());
    
    c.0.set(2);
    println!("{}", b.0.get());
}

Note that Cell is #[repr(transparent)], which is particularly useful in systems programming as it permits some kinds of zero-cost conversion between different types.
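One place where this layout guarantee shows up in the standard library is Cell::from_mut, which turns a &mut T into a &Cell<T> without copying anything. The sketch below is only an illustration (the types B and C and the function name are chosen to mirror the question), but it shows a plain stack integer being modified through two aliasing handles:

```rust
use std::cell::Cell;

// Illustrative handle types mirroring B and C from the question.
struct B<'a>(&'a Cell<i32>);
struct C<'a>(&'a Cell<i32>);

/// Mutates a plain integer through two handles that alias it.
fn bump_through_two_handles(value: &mut i32) {
    // Reinterpret `&mut i32` as `&Cell<i32>`; no allocation, no copy.
    let shared: &Cell<i32> = Cell::from_mut(value);
    let b = B(shared);
    let c = C(shared);

    b.0.set(69);
    c.0.set(c.0.get() + 2);
}

fn main() {
    let mut i = 42; // an ordinary stack-allocated integer
    bump_through_two_handles(&mut i);
    println!("{}", i); // prints 71
}
```

The exclusive borrow of i lasts only for the duration of the call, so the value can be used as a normal integer again afterwards.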

It's a little frustrating having to fight the language to accomplish something so basic.

It's not as basic as you think. One of Rust's main premises is that safe code has no undefined behaviour, and it is almost impossible to allow two simultaneous mutable references while upholding that guarantee. How would you make sure that, once multiple threads are involved, you don't accidentally get a data race? A data race already is undefined behaviour, and it might be exploitable for malicious purposes.

Learning Rust is not easy, and it is especially hard if you come from a different language, as many programming paradigms simply don't work in Rust. But I can assure you that once you understand how to structure code differently, it will actually become a positive thing, because Rust forces programmers to distance themselves from questionable patterns, or from patterns that seem fine but need a second glance to understand what is actually wrong with them. C/C++ bugs are usually very subtle and caused by some weird corner case, and after programming in Rust for a while, it is incredibly rewarding to have the assurance that those corner cases simply don't exist.

But back to your problem.

There are two language concepts here that need to be combined to achieve what you are trying to do.

For one, the borrow checker forces you to have only one mutable reference to a specific piece of data at a time. That means, if you definitely want to modify it from multiple places, you will have to utilize a concept called interior mutability. Depending on your use case, there are several ways to create interior mutability:

  • Cell - single-threaded, for primitive types that can be replaced by being copied. This is a zero-cost abstraction.
  • RefCell - single-threaded, for more complex types that require a mutable reference instead of being updatable by replacement. Minimal overhead to check whether it is already borrowed.
  • Atomic - multi-threaded, for primitive types. In most cases close to a zero-cost abstraction (on x86-64, aligned loads and stores of everything up to u64/i64 are already atomic at the hardware level, so relaxed reads and writes typically need no extra instructions).
  • Mutex - like RefCell, but for multiple threads. Larger overhead due to active internal lock management.

So depending on your use case, you need to choose the right one. In your case, if your data is really an int, I'd go with a Cell or an Atomic.
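For the multi-threaded variant, a minimal sketch could look like the following. It assumes Rust 1.63 or newer for std::thread::scope, and the struct and function names are made up for illustration. The atomic lives on the stack, and two threads update it through shared references:

```rust
use std::sync::atomic::{AtomicI32, Ordering};
use std::thread;

// Illustrative handle types; both borrow the same stack-allocated atomic.
struct B<'a> {
    a: &'a AtomicI32,
}
struct C<'a> {
    a: &'a AtomicI32,
}

/// Adds 1 and 2 to `start` from two different threads.
fn add_from_two_threads(start: i32) -> i32 {
    let a = AtomicI32::new(start); // lives on this function's stack
    let b = B { a: &a };
    let c = C { a: &a };

    // Scoped threads are joined before `scope` returns, so borrowing
    // stack data from them is allowed.
    thread::scope(|s| {
        s.spawn(|| {
            b.a.fetch_add(1, Ordering::Relaxed);
        });
        s.spawn(|| {
            c.a.fetch_add(2, Ordering::Relaxed);
        });
    });

    a.load(Ordering::Relaxed)
}

fn main() {
    println!("{}", add_from_two_threads(42)); // prints 45
}
```

Relaxed ordering is enough here because only the final value matters and joining the threads already synchronizes with the last load; code that uses the atomic to publish other data would need stronger orderings.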

Second, there is the problem of how to get multiple (immutable) references to your object in the first place.

Right away, I would like to tell you: do not reach for raw pointers prematurely. Dereferencing raw pointers requires unsafe and sidesteps the borrow checker, which defeats much of the point of using Rust. The vast majority of problems can be solved well and performantly without raw pointers, so only use them in circumstances where absolutely no alternative exists.

That said, there are three general ways to share data:

  • &A - Normal reference. While the reference exists, the referenced object cannot be moved or dropped. So this is probably not what you want.
  • Rc<A> - Single-threaded reference counter. Very lightweight, so don't worry about overhead. Accessing the data is a zero-cost abstraction; additional cost only arises when you clone or drop the actual Rc object. Moving the Rc object should theoretically be free, as this doesn't change the reference count.
  • Arc<A> - Multi-threaded reference counter. Like Rc, the actual access is zero-cost, but the cost of cloning or dropping the Arc object itself is slightly higher than for Rc, because the count is updated atomically. Moving the Arc object should theoretically be free, as this doesn't change the reference count.
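The claim that only cloning and dropping touch the reference count can be checked with Rc::strong_count. This is a small sketch with a made-up function name, not a benchmark:

```rust
use std::rc::Rc;

/// Returns the strong count after a clone, after a move, and after a drop.
fn observe_counts() -> (usize, usize, usize) {
    let a = Rc::new(42);

    let b = Rc::clone(&a); // clone: count goes from 1 to 2
    let after_clone = Rc::strong_count(&a);

    let moved = b; // move: the count is not touched
    let after_move = Rc::strong_count(&a);

    drop(moved); // drop: count goes back to 1
    let after_drop = Rc::strong_count(&a);

    (after_clone, after_move, after_drop)
}

fn main() {
    println!("{:?}", observe_counts()); // prints (2, 2, 1)
}
```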

So assuming that you have a single-threaded program and the problem is exactly as you laid it out, I'd do:

use std::{cell::Cell, rc::Rc};

struct A {
    i: Cell<i32>,
}
struct B {
    a: Rc<A>,
}
struct C {
    a: Rc<A>,
}

fn main() {
    let a = Rc::new(A { i: Cell::new(42) });
    let b = B { a: Rc::clone(&a) };
    let c = C { a: Rc::clone(&a) };

    b.a.i.set(69);
    c.a.i.set(c.a.i.get() + 2);
    println!("{}", a.i.get());
}
71

But of course all the other combinations, like Rc + Atomic, Arc + Atomic, Arc + Mutex etc. are also viable. It depends on your use case.
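As one example of such a combination, here is a rough sketch of Arc + Mutex, for the case where the shared object has to outlive the current stack frame and is modified from several threads. The function name is made up for illustration:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

struct A {
    i: Mutex<i32>,
}

/// Shares one heap-allocated `A` between two threads and updates it from both.
fn update_from_threads() -> i32 {
    let a = Arc::new(A { i: Mutex::new(42) });
    let b = Arc::clone(&a);
    let c = Arc::clone(&a);

    // Each thread takes ownership of its own Arc handle.
    let t1 = thread::spawn(move || {
        *b.i.lock().unwrap() += 1;
    });
    let t2 = thread::spawn(move || {
        *c.i.lock().unwrap() += 2;
    });
    t1.join().unwrap();
    t2.join().unwrap();

    // Copy the value out so the lock guard is released before returning.
    let result = *a.i.lock().unwrap();
    result
}

fn main() {
    println!("{}", update_from_threads()); // prints 45
}
```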

If your b and c objects provably live shorter than a (meaning, if they only exist for a couple of lines of code and don't get moved anywhere else), then of course use a reference instead of Rc. The biggest performance difference between Rc and a direct reference is that the object inside of an Rc lives on the heap, not on the stack, so it's the equivalent of calling new / delete once in C++.

So, for reference, if your data sharing allows the object to live on the stack, like in our example, then the code would look like this:

use std::cell::Cell;

struct A {
    i: Cell<i32>,
}
struct B<'a> {
    a: &'a A,
}
struct C<'a> {
    a: &'a A,
}

fn main() {
    let a = A { i: Cell::new(42) };
    let b = B { a: &a };
    let c = C { a: &a };

    b.a.i.set(69);
    c.a.i.set(c.a.i.get() + 2);
    println!("{}", a.i.get());
}
71

Note that the reference is zero-cost and Cell is also zero-cost, so this code will perform identically to a version using raw pointers, with the difference that the borrow checker can now prove that it will not cause undefined behaviour.

To demonstrate just how zero-cost this is, look at the assembly output of the example above. The compiler managed to optimize the entire code down to the equivalent of:

fn main() {
    println!("{}", 71);
}

Be aware that in your C example, nothing would prevent you from copying the b object somewhere else while a goes out of scope and gets destroyed. This would cause undefined behaviour, and it is prevented by the borrow checker in Rust, which is the reason the structs B and C carry the lifetime 'a to track the fact that they borrow an A.
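The following sketch illustrates that check. The dangling variant is left commented out because it is rejected at compile time; the helper read_through is made up for illustration:

```rust
use std::cell::Cell;

struct A {
    i: Cell<i32>,
}
struct B<'a> {
    a: &'a A,
}

/// Reads the value through the borrowing handle.
fn read_through(b: &B<'_>) -> i32 {
    b.a.i.get()
}

fn main() {
    let a = A { i: Cell::new(42) };
    let b = B { a: &a };
    println!("{}", read_through(&b)); // prints 42

    // The C-style mistake does not compile in Rust:
    //
    // let dangling;
    // {
    //     let short_lived = A { i: Cell::new(1) };
    //     dangling = B { a: &short_lived };
    // } // error[E0597]: `short_lived` does not live long enough
    // println!("{}", dangling.a.i.get());
}
```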

Lastly, I would like to talk about unsafe code. Yes, if you interface with C or write low-level drivers that require direct memory access, unsafe is absolutely important to have. That said, it's important that you understand how to deal with unsafe in order to still uphold Rust's safety guarantees. Otherwise, there really isn't any point in using Rust. Don't just use unsafe out of convenience to simply overrule the borrow checker, but instead make sure that the resulting unsafe usage is sound. Please read this article about soundness before you use the unsafe keyword.

I hope this gave you a glimpse of what kind of thinking is required for programming in Rust, and I hope it did not intimidate you too much. Give it a chance; while it has a fairly steep learning curve, especially for programmers with strong prior knowledge in other languages, it can be very rewarding.
