Must everything used in a thread be `Send`able in Rust?

I'm trying to implement concurrent processing in Rust. Here is the (simplified) code (playground):

struct Checker {
    unsendable: std::rc::Rc<String> // `Rc` does not implement Send + Sync (`Rc` just for simplicity, a graph actually)
}

impl Checker {
    pub fn check(&self, input: String) -> bool {
        // some logic (details don't matter here)
        true
    }
}

struct ParallelChecker {
    allow_checker: Checker,
    block_checker: Checker
}

impl ParallelChecker {
    // `String` for simplicity here
    pub fn check(&self, input: String) -> bool {
        let thread_pool = rayon::ThreadPoolBuilder::new()
            .num_threads(2)
            .build()
            .unwrap();

        // scoped thread pool for simplicity (to avoid `self` is not `'static`)
        thread_pool.scope(move |s| {
            s.spawn(|_| {
                let result = self.allow_checker.check(input.clone());
                // + send result via channel
            });
        });

        thread_pool.scope(move |s| {
            s.spawn(|_| {
                let result = self.block_checker.check(input.clone());
                // + send result via channel
            });
        });

        true // for simplicity
        // + receive result from the channels
    }
}

The problem is that self.allow_checker and self.block_checker don't implement Send:

55 |         thread_pool.scope(move |s| {
   |                     ^^^^^ `Rc<String>` cannot be sent between threads safely
   |
   = help: within `Checker`, the trait `Send` is not implemented for `Rc<String>`
   = note: required because it appears within the type `Checker`
   = note: required because of the requirements on the impl of `Send` for `Mutex<Checker>`
   = note: 1 redundant requirements hidden
   = note: required because of the requirements on the impl of `Send` for `Arc<Mutex<Checker>>`
   = note: required because it appears within the type `[closure@src/parallel_checker.rs:55:27: 60:10]`

I was under the impression that only something that is sent via channels must implement Send + Sync, and I'm probably wrong here.

As you can see, the thread code does not share any variables (except self). How can I make it work without implementing Send and without paying any synchronization cost?

I've tried to synchronize the access (though it looks useless, as there is no shared variable in the threads), but no luck:

let allow_checker = Arc::new(Mutex::new(self.allow_checker));
thread_pool.scope(move |s| {
    s.spawn(|_| {
        let result = allow_checker.lock().unwrap().check(input.clone());
        // + send result via channel
    });
});

PS. Migrating to Arc is highly undesirable because of the performance cost and because lots of related code would have to be migrated to Arc too.

> I was under the impression that only something that is sent via channels must implement Send + Sync, and I'm probably wrong here.

You are slightly wrong:

  • A type is Send if its values can be sent across threads. Many types are Send; String is, for example.
  • A type is Sync if references to its values can be accessed from multiple threads without incurring any data race. Perhaps surprisingly, this means that String is Sync, by virtue of being immutable when shared. In general, T is Sync if and only if &T is Send.

Note that those rules do not care how values are sent or shared across threads, only that they are.
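One way to see these definitions in action is to ask the compiler directly. The sketch below uses two hypothetical helpers, `is_send` and `is_sync` (they are not part of the standard library); a call compiles only if the type satisfies the bound.

```rust
use std::rc::Rc;
use std::sync::Arc;

// Hypothetical helpers: each call compiles only if `T` satisfies the bound.
fn is_send<T: Send>() -> bool {
    true
}

fn is_sync<T: Sync>() -> bool {
    true
}

fn main() {
    // `String` can be moved to another thread and shared by reference.
    assert!(is_send::<String>());
    assert!(is_sync::<String>());

    // `&String` is `Send` precisely because `String` is `Sync`.
    assert!(is_send::<&'static String>());

    // `Arc<String>` keeps an atomic reference count, so it is both.
    assert!(is_send::<Arc<String>>());
    assert!(is_sync::<Arc<String>>());

    // Uncommenting the next line fails to compile: `Rc<String>` is not `Send`.
    // assert!(is_send::<Rc<String>>());
    let _local_only: Rc<String> = Rc::new(String::from("fine on one thread"));
}
```

Nothing is checked at run time here; the bounds are resolved entirely at compile time.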

This is important here because the closure you use to start a thread is itself sent across threads: it is created on the "launcher" thread and executed on the "launched" thread.

As a result, this closure must be Send, which means that anything it captures must in turn be Send.
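A minimal sketch of that hand-off, using std::thread::spawn rather than the rayon pool from the question. The function name `check_on_new_thread` and the emptiness check are placeholders invented for illustration.

```rust
use std::thread;

// Hypothetical helper: runs a trivial check on a freshly spawned thread.
fn check_on_new_thread(input: String) -> bool {
    // The closure is built here, on the launching thread. `move` transfers
    // ownership of `input` into it, so the closure owns everything it uses.
    let task = move || !input.is_empty();

    // `thread::spawn` requires the closure to be `Send + 'static`: it is
    // handed to the new thread and executed there.
    let handle = thread::spawn(task);

    // `join` brings the result back; the return type must be `Send` as well.
    handle.join().expect("worker thread panicked")
}

fn main() {
    println!("{}", check_on_new_thread(String::from("example.com")));
}
```

If `input` were an Rc<String> instead, the call to thread::spawn would be rejected at compile time, because the closure capturing it would no longer be Send.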

Why doesn't Rc implement Send?

Because its reference count is non-atomic.

That is, if you have:

  • Thread A: one Rc.
  • Thread B: one Rc (same pointee).

And then Thread A drops its handle at the same time as Thread B creates a clone, you'd expect the count to be 2 (still), but due to non-atomic accesses it could end up as 1 even though 2 handles still exist:

  • Thread A reads count (2).
  • Thread B reads count (2).
  • Thread B writes incremented count (3).
  • Thread A writes decremented count (1).

And then, the next time B drops a handle, the item is destroyed and its memory released, and any further access via the last handle will blow up in your face.
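The interleaving above can be replayed deterministically on a single thread with a plain integer standing in for the reference count. This is only a simulation of the lost update, not Rc's actual implementation; both function names are hypothetical.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Replays the four steps above with separate reads and writes.
fn replay_non_atomic(start: usize) -> usize {
    let mut count = start;
    let seen_by_a = count; // Thread A reads count
    let seen_by_b = count; // Thread B reads count
    count = seen_by_b + 1; // Thread B writes incremented count
    assert_eq!(count, start + 1);
    count = seen_by_a - 1; // Thread A writes decremented count, losing B's update
    count
}

// The same two operations as indivisible read-modify-write steps,
// which is the kind of update an atomic counter provides.
fn replay_atomic(start: usize) -> usize {
    let count = AtomicUsize::new(start);
    count.fetch_add(1, Ordering::SeqCst); // Thread B clones
    count.fetch_sub(1, Ordering::SeqCst); // Thread A drops
    count.load(Ordering::SeqCst)
}

fn main() {
    println!("non-atomic: {}", replay_non_atomic(2)); // prints "non-atomic: 1"
    println!("atomic: {}", replay_atomic(2)); // prints "atomic: 2"
}
```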

> I've tried to synchronize the access (though it looks useless, as there is no shared variable in the threads), but no luck:

You can't wrap a type to make it Send. Wrapping doesn't help because it doesn't change the type's fundamental properties.

The above race condition on Rc could happen even with an Rc wrapped in Arc<Mutex<...>>, because other clones of the same Rc can exist outside the lock and touch the count without taking it.

Therefore !Send is contagious and "infects" any containing type.
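A single-threaded sketch of why the lock does not protect the count: a clone made before the Rc went into the Mutex can still be cloned or dropped while the lock is held. The function name `count_handles_while_locked` is hypothetical.

```rust
use std::rc::Rc;
use std::sync::Mutex;

// Returns the strong count observed while the lock is held.
fn count_handles_while_locked() -> usize {
    let outside = Rc::new(String::from("graph"));
    // Only this one handle lives behind the lock.
    let guarded = Mutex::new(Rc::clone(&outside));

    let inside = guarded.lock().unwrap(); // lock held from here on
    // `outside` never goes through the lock, yet cloning it updates the
    // same non-atomic count that `inside` points to.
    let _extra = Rc::clone(&outside);

    let count = Rc::strong_count(&*inside);
    count
}

fn main() {
    println!("{}", count_handles_while_locked()); // prints "3"
}
```

On one thread this is harmless. If the Mutex could be sent to another thread, the two threads would update the count concurrently, which is exactly what the compiler refuses to allow.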

> Migrating to Arc is highly undesirable because of the performance cost and because lots of related code would have to be migrated to Arc too.

Arc itself has relatively little performance overhead, so it seems unlikely to matter unless you keep cloning those Checker values, which you could probably avoid by passing references instead of clones.

The higher overhead here will come from Mutex (or RwLock) if Checker is not Sync. As I mentioned, a value that is never mutated while shared (and has no interior mutability) is trivially Sync, so if you can refactor the internal state of Checker to be Sync, then you can avoid the Mutex entirely and just have Checker contain Arc<State>.

If you have mutable state at the moment, consider extracting it, going towards:

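use std::sync::Arc;

// Placeholders: `Immutable` is the shared read-only state,
// `Mutable` is whatever per-call state `check` produces.
struct Immutable;
struct Mutable;

struct Checker {
    unsendable: Arc<Immutable>,
}

impl Checker {
    pub fn check(&self, input: String) -> Mutable {
        unimplemented!()
    }
}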

Or drop the idea of parallel checks.
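A fuller sketch of the Arc-based refactor, under several assumptions. It uses std::thread::scope from the standard library in place of the rayon pool from the question. The `Rules` type, its `needle` field, and the substring check are hypothetical stand-ins for the real read-only state and logic.

```rust
use std::sync::mpsc;
use std::sync::Arc;
use std::thread;

// Hypothetical stand-in for the real read-only state (a graph in the question).
struct Rules {
    needle: String,
}

struct Checker {
    // `Arc<Rules>` is `Send + Sync` because `Rules` itself is.
    rules: Arc<Rules>,
}

impl Checker {
    fn new(needle: &str) -> Self {
        Checker {
            rules: Arc::new(Rules {
                needle: needle.to_string(),
            }),
        }
    }

    // Read-only check: borrows both `self` and the input, so no lock is needed.
    fn check(&self, input: &str) -> bool {
        input.contains(&self.rules.needle)
    }
}

struct ParallelChecker {
    allow_checker: Checker,
    block_checker: Checker,
}

impl ParallelChecker {
    // Returns (allowed, blocked), each computed on its own scoped thread.
    fn check(&self, input: &str) -> (bool, bool) {
        let (allow_tx, allow_rx) = mpsc::channel();
        let (block_tx, block_rx) = mpsc::channel();

        // Scoped threads may borrow `self` and `input`; both threads are
        // joined before `scope` returns.
        thread::scope(|s| {
            s.spawn(move || allow_tx.send(self.allow_checker.check(input)).unwrap());
            s.spawn(move || block_tx.send(self.block_checker.check(input)).unwrap());
        });

        (allow_rx.recv().unwrap(), block_rx.recv().unwrap())
    }
}

fn main() {
    let checker = ParallelChecker {
        allow_checker: Checker::new("example.com"),
        block_checker: Checker::new("/ads/"),
    };
    let (allowed, blocked) = checker.check("https://example.com/ads/banner.js");
    println!("allowed: {allowed}, blocked: {blocked}");
}
```

Because Checker is now Sync, the threads share &self directly. No Mutex is involved, and the Arc is never cloned on the hot path.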

The requirement for Send is not limited to channels. If you look at the documentation for thread::spawn, you will see that the closure it takes must be Send, and therefore can only capture Send values. The same applies to the closure taken by rayon::ThreadPool::scope.

Since you're cloning the strings in the closures, you might as well clone them before spawning the threads:

let input_clone = input.clone();
thread_pool.scope(move |s| {
    s.spawn(|_| {
        let result = self.allow_checker.check(input_clone);
        // + send result via channel
    });
});
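The same clone-before-spawn pattern, sketched with std::thread::scope so that it runs without rayon. The two free functions are hypothetical stand-ins for the checkers; each takes ownership of its input, like check(&self, input: String) in the question.

```rust
use std::thread;

// Hypothetical stand-ins for the two checkers.
fn allow_check(input: String) -> bool {
    input.starts_with("https://")
}

fn block_check(input: String) -> bool {
    input.contains("/ads/")
}

fn check_both(input: String) -> (bool, bool) {
    // One clone is made up front; the original moves into the second thread.
    let input_clone = input.clone();

    thread::scope(|s| {
        let allow = s.spawn(move || allow_check(input_clone));
        let block = s.spawn(move || block_check(input));
        // Each handle returns its thread's result when joined.
        (allow.join().unwrap(), block.join().unwrap())
    })
}

fn main() {
    let (allowed, blocked) = check_both(String::from("https://example.com/ads/x.js"));
    println!("allowed: {allowed}, blocked: {blocked}");
}
```

Cloning outside the closure means each thread owns its own String and nothing is borrowed from the caller. It does not remove the Send requirement on whatever else the closure captures.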
