
Rust Rayon Threads create Garbage

I have a larger program that can be summarized as follows:

SequentialPart
ThreadPoolParallelized
SequentialPart
ParallelPartInQuestion
SequentialPart

This code is called many times in sequence.

I'm using Rayon to parallelize the second parallel part (ParallelPartInQuestion) like this:

    final_results = (0..num_txns)
        .into_par_iter()
        .filter_map(|idx| {
            // Skip the remaining outputs once `ret` is set.
            if !matches!(ret, None) {
                return None;
            }
            match last_input_output.take_output(idx) {
                ExecutionStatus::Success(t) => Some(t),
                ExecutionStatus::SkipRest(t) => Some(t),
                ExecutionStatus::Abort(_err) => None,
            }
        })
        .collect();

I've also tried this using parallel chunks:

 let interm_result: Vec<ExtrResult<E>> = (0..num_txns)
                .collect::<Vec<TxnIndex>>()
                .par_chunks(chunk_size)
                .map(|chunk| {

Either way, I noticed that the first time this code runs, everything works as expected and I get a decent performance boost out of it.

However, from the second iteration onward, the first parallel piece of code (ThreadPoolParallelized) runs around 20% slower every time.

So I concluded that Rayon must somehow leave something behind that has to be cleaned up afterwards, which causes this performance drop.

Is there something I can do about this?

Edit: This is what take_output does:

    outputs: Vec<CachePadded<ArcSwapOption<TxnOutput<T, E>>>>, // txn_idx -> output.

    pub fn take_output(&self, txn_idx: TxnIndex) -> ExecutionStatus<T, Error<E>> {
        let owning_ptr = self.outputs[txn_idx]
            .swap(None)
            .expect("Output must be recorded after execution");

        if let Ok(output) = Arc::try_unwrap(owning_ptr) {
            output
        } else {
            unreachable!("Output should be uniquely owned after execution");
        }
    }
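
The take-and-unwrap pattern in take_output can be tried out in isolation. The following is only a sketch that uses the standard library: a Mutex<Option<Arc<T>>> stands in for ArcSwapOption (an assumption made to avoid extra dependencies), and the names Slots, record and take are hypothetical. Unlike take_output, it returns None instead of panicking when the slot is empty or the Arc is still shared.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for the `outputs` vector: one slot per index.
struct Slots<T> {
    slots: Vec<Mutex<Option<Arc<T>>>>,
}

impl<T> Slots<T> {
    /// Create `len` empty slots.
    fn new(len: usize) -> Self {
        Slots {
            slots: (0..len).map(|_| Mutex::new(None)).collect(),
        }
    }

    /// Store a value in slot `idx`, replacing any previous one.
    fn record(&self, idx: usize, value: T) {
        *self.slots[idx].lock().unwrap() = Some(Arc::new(value));
    }

    /// Move the value out of slot `idx`, leaving `None` behind.
    /// Returns `None` if nothing was recorded, or if another clone of the
    /// `Arc` is still alive (so the value is not uniquely owned).
    fn take(&self, idx: usize) -> Option<T> {
        let owning_ptr = self.slots[idx].lock().unwrap().take()?;
        Arc::try_unwrap(owning_ptr).ok()
    }
}
```

Calling take twice on the same index yields the value the first time and None the second time, which mirrors why take_output expects each output to be recorded exactly once before it is consumed.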

I figured out what was causing the problem. The first parallel part of this execution (ThreadPoolParallelized) uses a manually created thread pool. However, into_par_iter runs on Rayon's global thread pool unless told otherwise, and that pool's worker threads stay alive after the parallel call returns. This interferes with the manually created thread pool. The fix is to run the second parallel part on the same pool:

 let interm_result: Vec<ExtrResult<E>> = RAYON_EXEC_POOL.install(|| {
                (0..num_txns)

By explicitly wrapping the code that is supposed to run in parallel in a pool.install call, it re-uses the same thread pool instead of bringing up a second one, which avoids the extra overhead and preserves performance.

