
A cell with interior mutability allowing arbitrary mutation actions

The standard Cell struct provides interior mutability but allows only a few mutation methods, such as set(), swap() and replace(). All of these methods change the whole content of the Cell. However, sometimes more specific manipulations are needed, for example, changing only a part of the data contained in the Cell.

So I tried to implement a kind of universal Cell that allows arbitrary data manipulation. The manipulation is represented by a user-defined closure that accepts a single argument, a &mut reference to the interior data of the Cell, so the user can decide what to do with the Cell interior. The code below demonstrates the idea:

use std::cell::UnsafeCell;

struct MtCell<Data>{
    dcell: UnsafeCell<Data>,
}

impl<Data> MtCell<Data>{
    fn new(d: Data) -> MtCell<Data> {
        return MtCell{dcell: UnsafeCell::new(d)};
    }

    fn exec<F, RetType>(&self, func: F) -> RetType where
        RetType: Copy,
        F: Fn(&mut Data) -> RetType 
    {
        let p = self.dcell.get();
        let pd: &mut Data;
        unsafe{ pd = &mut *p; }
        return func(pd);
    }
}

// test:

type MyCell = MtCell<usize>;

fn main(){
    let c: MyCell = MyCell::new(5);
    println!("initial state: {}", c.exec(|pd| {return *pd;}));
    println!("state changed to {}", c.exec(|pd| {
        *pd += 10; // modify the interior "in place"
        return *pd;
    }));
}

However, I have some concerns regarding the code.

  1. Is it safe, i.e. can some safe but malicious closure break Rust's mutability/borrowing/lifetime rules by using this "universal" cell? I consider it safe, since the lifetime of the interior reference parameter prevents it from being exposed beyond the closure call. But I still have doubts (I'm new to Rust).

  2. Maybe I'm re-inventing the wheel and there already exist some templates or techniques that solve the problem?

Note: I posted the question here (not on Code Review) as it seems more related to the language than to the code itself (which represents just a concept).

[EDIT] I'd want a zero-cost abstraction without the possibility of runtime failures, so RefCell is not a perfect solution.

This is a very common pitfall for Rust beginners.

  1. Is it safe, i.e. can some safe but malicious closure break Rust's mutability/borrowing/lifetime rules by using this "universal" cell? I consider it safe, since the lifetime of the interior reference parameter prevents it from being exposed beyond the closure call. But I still have doubts (I'm new to Rust).

In a word, no.

Playground:

fn main() {
    let mt_cell = MtCell::new(123i8);

    mt_cell.exec(|ref1: &mut i8| {
        mt_cell.exec(|ref2: &mut i8| {
            println!("Double mutable ref!: {:?} {:?}", ref1, ref2);
        })
    })
}

You're absolutely right that the reference cannot be used outside of the closure, but inside the closure all bets are off. In fact, pretty much any operation (read or write) on the cell within the closure is undefined behavior (UB) and may cause corruption or crashes anywhere in your program.

  2. Maybe I'm re-inventing the wheel and there already exist some templates or techniques that solve the problem?

Using Cell is often not the best technique, but it's impossible to know what the best solution is without knowing more about the problem.

If you insist on Cell, there are safe ways to do this. The unstable (i.e. nightly-only, at the time of writing) Cell::update() method is literally implemented with the following code (when T: Copy):

pub fn update<F>(&self, f: F) -> T
where
    F: FnOnce(T) -> T,
{
    let old = self.get();
    let new = f(old);
    self.set(new);
    new
}
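The same get/modify/set pattern can be written on stable Rust as a free function. This is only a minimal sketch: update_cell is a hypothetical helper name, not part of the standard library, and it assumes T: Copy.

```rust
use std::cell::Cell;

/// Hypothetical helper mirroring the get/modify/set pattern shown above.
/// It needs no `unsafe` and has no runtime borrow checks.
fn update_cell<T: Copy, F: FnOnce(T) -> T>(cell: &Cell<T>, f: F) -> T {
    let old = cell.get(); // copy the value out of the cell
    let new = f(old);     // the closure only ever sees a copy, never a reference
    cell.set(new);        // write the result back
    new
}

fn main() {
    let c = Cell::new(5usize);
    let state = update_cell(&c, |n| n + 10);
    println!("state changed to {}", state);

    // Re-entrant use is harmless: the inner call writes to the cell,
    // while the outer closure keeps working on its own copy.
    let outer = update_cell(&c, |n| {
        update_cell(&c, |m| m * 2);
        n + 1 // the outer result overwrites whatever the inner call stored
    });
    println!("after nested update: {}", outer);
}
```

Because the closure receives a value rather than a &mut into the cell, the nested call that breaks MtCell cannot create aliasing references here.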

Or you could use Cell::get_mut(), but it requires a &mut reference to the Cell itself, so I guess that defeats the whole purpose of Cell.
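For illustration, a small sketch of Cell::get_mut(); bump_middle is a hypothetical helper name. Note that the caller must already hold exclusive access to the cell:

```rust
use std::cell::Cell;

// Hypothetical helper: with `&mut Cell<_>` in hand, a plain `&mut` to the
// contents is safe, because nobody else can reach the cell at the same time.
fn bump_middle(cell: &mut Cell<(i8, i8, i8)>) {
    cell.get_mut().1 += 1; // change only one part of the data, in place
}

fn main() {
    let mut c = Cell::new((1i8, 2i8, 3i8));
    bump_middle(&mut c);
    println!("{:?}", c.get());
}
```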

However, usually the best way to change only part of a Cell is to break it up into separate Cells. For example, instead of Cell<(i8, i8, i8)>, use (Cell<i8>, Cell<i8>, Cell<i8>).
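A minimal sketch of that idea, using a hypothetical Point type (the type and method names are made up for illustration):

```rust
use std::cell::Cell;

// Hypothetical example type: each field lives in its own `Cell`,
// so a single field can be changed through a shared reference.
struct Point {
    x: Cell<i8>,
    y: Cell<i8>,
    z: Cell<i8>,
}

impl Point {
    fn new(x: i8, y: i8, z: i8) -> Self {
        Point { x: Cell::new(x), y: Cell::new(y), z: Cell::new(z) }
    }

    // Takes `&self`, yet modifies only `y`; `x` and `z` are untouched.
    fn shift_y(&self, dy: i8) {
        self.y.set(self.y.get() + dy);
    }

    fn as_tuple(&self) -> (i8, i8, i8) {
        (self.x.get(), self.y.get(), self.z.get())
    }
}

fn main() {
    let p = Point::new(1, 2, 3);
    p.shift_y(10);
    println!("{:?}", p.as_tuple());
}
```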

Still, IMO, Cell is rarely the best solution. Interior mutability is a common design in C and many other languages, but it is somewhat rarer in Rust, at least via shared references and Cell, for a number of reasons (e.g. Cell is not Sync, and in general people don't expect mutation without &mut). Ask yourself why you are using Cell and whether it is really impossible to reorganize your code to use normal &mut references.
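As a rough sketch of what such a reorganization might look like, here is the closure-based API from the question rewritten to take &mut self. Holder is a hypothetical name, and whether this fits depends on how the value is shared in the real program:

```rust
// Hypothetical wrapper, not from the question: same closure-based API as
// `MtCell::exec`, but it takes `&mut self`, so no `unsafe` is needed.
struct Holder<T> {
    data: T,
}

impl<T> Holder<T> {
    fn exec<R>(&mut self, f: impl FnOnce(&mut T) -> R) -> R {
        f(&mut self.data)
    }
}

fn main() {
    let mut h = Holder { data: 5usize };
    let state = h.exec(|d| {
        *d += 10; // modify the interior in place
        *d
    });
    println!("state changed to {}", state);
    // A nested `h.exec(..)` inside the closure would need a second `&mut h`
    // and is rejected at compile time.
}
```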

IMO the bottom line is actually about safety: if the compiler complains no matter what you do and it seems that you need to use unsafe, then I guarantee you that 99% of the time either:

  1. There's a safe (but possibly complex/unintuitive) way to do it, or
  2. It's actually undefined behavior (like in this case).

EDIT: Frxstrem's answer also has better info about when to use Cell / RefCell.

Your code is not safe, since you can call c.exec inside c.exec to get two mutable references to the cell contents, as demonstrated by this snippet containing only safe code:

let c: MyCell = MyCell::new(5);
c.exec(|n| {
    // need `RefCell` to access mutable reference from within `Fn` closure
    let n = RefCell::new(n);

    c.exec(|m| {
        let n = &mut *n.borrow_mut();
        
        // now `n` and `m` are mutable references to the same data, despite using
        // no unsafe code. this is BAD!
    })
})

In fact, this is exactly the reason why we have both Cell and RefCell:

  • Cell only allows you to get and set a value and does not allow you to get a mutable reference from an immutable one (thus avoiding the above issue), but it does not have any runtime cost.
  • RefCell allows you to get a mutable reference from an immutable one, but needs to perform checks at runtime to ensure that this is safe.

As far as I know, there's not really any safe way around this, so you need to make a choice in your code between no runtime cost but less flexibility, and more flexibility but with a small runtime cost.
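To make the RefCell side of that trade-off concrete, here is a small sketch of the runtime check catching a nested mutable borrow. nested_borrow_is_refused is a hypothetical helper name; try_borrow_mut() is used instead of borrow_mut() so that the conflict shows up as an Err rather than a panic.

```rust
use std::cell::RefCell;

// Hypothetical helper: reports whether a second mutable borrow is refused
// while a first one is still alive.
fn nested_borrow_is_refused(cell: &RefCell<usize>) -> bool {
    let _outer = cell.borrow_mut(); // first mutable borrow is granted
    cell.try_borrow_mut().is_err()  // second one is detected at runtime
}

fn main() {
    let c = RefCell::new(5usize);
    *c.borrow_mut() += 10; // the borrow ends at the end of this statement
    println!("state changed to {}", c.borrow());
    println!("nested borrow refused: {}", nested_borrow_is_refused(&c));
}
```

With plain borrow_mut() the nested borrow would panic instead, which is the kind of runtime failure the question's edit wants to avoid.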
