
Signaling in OpenMP

I am writing computational code that more or less follows this schematic:

#pragma omp parallel
{
    #pragma omp for nowait
    for (int i = 0; i < N; ++i) {
        // Compute element A[i]; iterations are independent and run in parallel.
    }

    #pragma omp single
    for (int i = 0; i < N; ++i) {
        // Do some operation with A[i].
        // This time it is important that operations are sequential, e.g.:
        result = compute_new_result(result, A[i]);
    }
}

Both computing A[i] and compute_new_result are rather expensive. So my idea is to compute the array elements in parallel, and as soon as any thread becomes free, it starts doing the sequential operations. There is a good chance that the first array elements have already been computed, and the rest will be provided by the other threads that are still working on the first loop.

However, to make the concept work I have to achieve two things:

  1. Make OpenMP split the loop in an interleaved way, i.e. for two threads: thread 1 computes A[0], A[2], A[4], ... and thread 2 computes A[1], A[3], A[5], etc.

  2. Provide some signaling system. I am thinking about an array of flags indicating that A[i] has already been computed. compute_new_result should then wait for the flag of the respective A[i] to be set before proceeding.
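
The flag array from item 2 could be sketched roughly as follows. This is only one possible reading of the idea, not a recommended design: compute_element, the body of compute_new_result and run_with_flags are hypothetical stand-ins, and the flags use C++11 std::atomic. As far as I know the OpenMP specification does not formally define how std::atomic interacts with OpenMP threads, so treat the release/acquire pairing as an assumption that mainstream compilers happen to honour.

```cpp
#include <atomic>
#include <cassert>
#include <vector>

// Hypothetical stand-ins for the expensive routines in the question.
static double compute_element(int i) { return 0.5 * i + 1.0; }
static double compute_new_result(double r, double a) { return 0.9 * r + a; }

double run_with_flags(int n) {
    assert(n >= 0);
    std::vector<double> A(n);
    std::vector<std::atomic<int>> ready(n);  // one "A[i] is computed" flag per element
    for (auto& f : ready) f.store(0, std::memory_order_relaxed);
    double result = 0.0;

    #pragma omp parallel
    {
        // Producers: interleaved distribution, no barrier at the end of the loop.
        #pragma omp for schedule(static, 1) nowait
        for (int i = 0; i < n; ++i) {
            A[i] = compute_element(i);
            // Release store: publishes A[i] before the flag becomes visible.
            ready[i].store(1, std::memory_order_release);
        }

        // Consumer: exactly one thread walks the array strictly in order.
        #pragma omp single
        for (int i = 0; i < n; ++i) {
            // Busy-wait until the producer of A[i] has raised its flag.
            while (ready[i].load(std::memory_order_acquire) == 0) {
            }
            result = compute_new_result(result, A[i]);
        }
    }
    return result;
}
```

The consumer spins, so it burns a core while it waits; that is acceptable only if the producers are expected to stay ahead most of the time.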

I would be glad for any hints on how to achieve both goals. I need the solution to be portable across Linux, Windows and Mac. I am writing the whole code in C++11.


Edit:

I have figured out the answer to the first question. It looks like it is sufficient to add a schedule(static,1) clause to the #pragma omp for directive.
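
A small sketch of what schedule(static,1) does: with a chunk size of 1, iterations are handed out round-robin in thread-number order, so iteration i lands on thread i modulo the team size. The helper names (thread_id, team_size, record_owners) are made up for this illustration, and the #ifdef guards are only there so the code also builds when OpenMP is switched off.

```cpp
#include <cassert>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helpers that fall back to a single thread without OpenMP.
static int thread_id() {
#ifdef _OPENMP
    return omp_get_thread_num();
#else
    return 0;
#endif
}

static int team_size() {
#ifdef _OPENMP
    return omp_get_num_threads();
#else
    return 1;
#endif
}

// Records which thread executed each iteration; num_threads receives the team size.
std::vector<int> record_owners(int n, int& num_threads) {
    assert(n >= 0);
    std::vector<int> owner(n, -1);
    num_threads = 1;
    #pragma omp parallel
    {
        #pragma omp single
        num_threads = team_size();

        // Chunk size 1: thread 0 gets i = 0, T, 2T, ...; thread 1 gets 1, T+1, ...
        #pragma omp for schedule(static, 1)
        for (int i = 0; i < n; ++i) {
            owner[i] = thread_id();
        }
    }
    return owner;
}
```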

However, I am still looking for an elegant solution to the second issue...

If you don't mind replacing the OpenMP for worksharing construct with a loop that generates tasks instead, you can use OpenMP tasks to implement both parts of your application.

In the first loop you would create, instead of loop chunks, tasks that take on the compute load of the iterations. Each iteration of the second loop then also becomes an OpenMP task. The important part is then to synchronize the tasks between the two phases.

For that you can use task dependencies (introduced in OpenMP 4.0):

#pragma omp task depend(out:A[0])
{ A[0] = a(); }

#pragma omp task depend(in:A[0])
{ b(A[0]); }

This makes sure that task b does not run before task a has completed.
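
Applied to the whole array, the same pattern might look like the sketch below. It is an illustration under assumptions, not the poster's actual code: compute_element, the body of compute_new_result and run_with_tasks are placeholders. Each consumer task depends on its own A[i] and, through depend(inout: result), on the previous consumer task, which keeps the second phase strictly sequential.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for the expensive routines in the question.
static double compute_element(int i) { return 0.5 * i + 1.0; }
static double compute_new_result(double r, double a) { return 0.9 * r + a; }

double run_with_tasks(int n) {
    assert(n >= 0);
    std::vector<double> storage(n);
    double* A = storage.data();  // raw pointer so A[i] can appear in depend clauses
    double result = 0.0;

    #pragma omp parallel
    #pragma omp single
    {
        // Phase 1: one producer task per element.
        for (int i = 0; i < n; ++i) {
            #pragma omp task depend(out: A[i]) firstprivate(i) shared(A)
            { A[i] = compute_element(i); }
        }
        // Phase 2: consumer tasks, chained through "result" so they run in order.
        for (int i = 0; i < n; ++i) {
            #pragma omp task depend(in: A[i]) depend(inout: result) firstprivate(i) shared(A, result)
            { result = compute_new_result(result, A[i]); }
        }
    }  // the barrier at the end of the region waits for all tasks
    return result;
}
```

One thread generates all the tasks while the rest of the team executes them, so a consumer task can start as soon as its element and the previous result are ready.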

Cheers, -michael

This is probably an extended comment rather than an answer...

So, you have a two-phase computation. In phase 1 you can compute, independently, each entry in your array A. It is therefore straightforward to parallelise this using an OpenMP parallel for loop. But there is an issue here: naive allocations of work to threads are likely to lead to a (severely?) unbalanced load across threads.

In phase 2 there is a computation which is not so easily parallelised and which you plan to give to the first thread to finish its share of phase 1.

Personally I'd split this into 2 phases. In the first, use a parallel for loop. In the second, drop OpenMP and just use sequential code. Sort out the load balancing within phase 1 by tuning the arguments to a schedule clause; I'd be tempted to try schedule(guided) first.
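
A minimal sketch of that two-phase layout, assuming placeholder routines (compute_element, the body of compute_new_result and run_two_phase are invented here for illustration):

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for the expensive routines in the question.
static double compute_element(int i) { return 0.5 * i + 1.0; }
static double compute_new_result(double r, double a) { return 0.9 * r + a; }

double run_two_phase(int n) {
    assert(n >= 0);
    std::vector<double> A(n);

    // Phase 1: independent iterations; guided scheduling hands out shrinking
    // chunks, which may help when the cost per element varies.
    #pragma omp parallel for schedule(guided)
    for (int i = 0; i < n; ++i) {
        A[i] = compute_element(i);
    }

    // Phase 2: plain sequential code, outside any parallel region.
    double result = 0.0;
    for (int i = 0; i < n; ++i) {
        result = compute_new_result(result, A[i]);
    }
    return result;
}
```

No flags or signaling are needed here, because the implicit barrier at the end of the parallel for guarantees that every A[i] is ready before phase 2 starts.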

If tuning the schedule can't provide the balance you want, then investigate replacing parallel for with tasking.

Do not complicate the code for phase 2 by rolling your own signalling technique. I'm not concerned that the complication will overwhelm you (though you might be concerned about that), but that it will fail to deliver any benefit unless you sort out the load balance in phase 1. And once you've done that, you don't need to put phase 2 inside an OpenMP parallel region.
