
Performance with MS concurrency::parallel_for() for a single iteration

I suppose there have been numerous questions about performance issues when trying to achieve concurrency with parallel_for. I, too, have noticed a performance drop when parallelizing memory-access-intensive for loops with parallel_for. The application area I am working on is image processing.

Surprisingly, this performance drop appears even if I create a loop with a single iteration to be processed by parallel_for!

What I mean is that I have a code block, shown below, which executes in 7 seconds without any parallelisation:

<code block without parallelisation>   //(Executes in 7 seconds)

If I enclose the above code in a parallel_for loop as shown below, the execution time increases to 18 seconds:

parallel_for(0,1,[&](int random_var){   //(Executes in 18 seconds)
<code block without parallelisation> 
});

I completely fail to understand this behaviour. What could cause such a huge processing overhead? In this case, I assume there should not be any memory-bandwidth-related issues?

Let me know if you need more information about this specific problem.

Because even for a single iteration, parallel_for is going to execute your code in a thread, so there will be preemption with the main thread. There is also other thread-related bookkeeping work, and that takes time.
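One way to check how much of the slowdown comes from the dispatch itself is to time the same block both ways. What follows is only a rough sketch under stated assumptions, not the asker's code: `process`, `time_seconds`, `run_single_iteration`, `Comparison` and `compare` are hypothetical names, and `process` is a made-up memory-bound stand-in for the unspecified code block. On MSVC the wrapper calls `concurrency::parallel_for(0, 1, ...)` as in the question. On other compilers it falls back to a plain `std::thread`, which only approximates "run the body on another thread" and does not model the PPL scheduler.

```cpp
#include <chrono>
#include <numeric>
#include <thread>
#include <vector>

#if defined(_MSC_VER)
#include <ppl.h>  // concurrency::parallel_for (MSVC only)
#endif

// Hypothetical stand-in for "<code block without parallelisation>":
// a simple memory-bound pass over a buffer.
long long process(const std::vector<int>& pixels) {
    return std::accumulate(pixels.begin(), pixels.end(), 0LL);
}

// Returns the wall-clock time, in seconds, taken by one call to f.
template <typename F>
double time_seconds(F&& f) {
    const auto start = std::chrono::steady_clock::now();
    f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Runs body exactly once, wrapped the way the question wraps it.
template <typename F>
void run_single_iteration(F&& body) {
#if defined(_MSC_VER)
    concurrency::parallel_for(0, 1, [&](int) { body(); });
#else
    // Assumption: std::thread is a portable stand-in, not PPL itself.
    std::thread worker([&] { body(); });
    worker.join();
#endif
}

struct Comparison {
    long long direct_result;
    long long wrapped_result;
    double direct_seconds;
    double wrapped_seconds;
};

// Times the same work called directly and through the single-iteration wrapper.
Comparison compare(const std::vector<int>& pixels) {
    Comparison c{};
    c.direct_seconds = time_seconds([&] { c.direct_result = process(pixels); });
    c.wrapped_seconds = time_seconds([&] {
        run_single_iteration([&] { c.wrapped_result = process(pixels); });
    });
    return c;
}
```

Calling `compare` on a buffer large enough to run for several seconds, in an optimized build and repeated a few times, would show whether the gap between `direct_seconds` and `wrapped_seconds` is anywhere near the 7 versus 18 seconds reported. With a small buffer the two timings are dominated by noise.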
