Can I put multiple ordered statements in one ordered for loop (OpenMP)?

I just found out that this C code gives an ordered list of integers, as expected:

#include <stdio.h>
#include <unistd.h>
#include <omp.h>

int main() {
#pragma omp parallel for ordered schedule(dynamic)
  for (int i=0; i<10; i++) {
#pragma omp ordered
    {
    printf("%i             (tid=%i)\n",i,omp_get_thread_num()); fflush(stdout);
    }
  }
}

With both gcc and icc, however, the following gives undesired behaviour:

#include <stdio.h>
#include <unistd.h>
#include <omp.h>

int main() {
#pragma omp parallel for ordered schedule(dynamic)
  for (int i=0; i<10; i++) {
#pragma omp ordered
    {
    printf("%i             (tid=%i)\n",i,omp_get_thread_num()); fflush(stdout);
    }

    usleep(100*omp_get_thread_num());
    printf("WORK IS DONE  (tid=%i)\n",omp_get_thread_num()); fflush(stdout);
    usleep(100*omp_get_thread_num());

#pragma omp ordered
    {
    printf("  %i           (tid=%i)\n",i,omp_get_thread_num()); fflush(stdout);
    }
  }
} 

What I'd love to see is:
0
1
2
3
4
5
6
7
8
9
WORK IS DONE
WORK IS DONE
WORK IS DONE
WORK IS DONE
WORK IS DONE
WORK IS DONE
WORK IS DONE
WORK IS DONE
WORK IS DONE
WORK IS DONE
0
1
2
3
4
5
6
7
8
9

But with gcc I get:
0 (tid=5)
WORK IS DONE (tid=5)
0 (tid=5)
1 (tid=2)
WORK IS DONE (tid=2)
1 (tid=2)
2 (tid=0)
WORK IS DONE (tid=0)
2 (tid=0)
3 (tid=6)
WORK IS DONE (tid=6)
3 (tid=6)
4 (tid=7)
WORK IS DONE (tid=7)
4 (tid=7)
5 (tid=3)
WORK IS DONE (tid=3)
5 (tid=3)
6 (tid=4)
WORK IS DONE (tid=4)
6 (tid=4)
7 (tid=1)
WORK IS DONE (tid=1)
7 (tid=1)
8 (tid=5)
WORK IS DONE (tid=5)
8 (tid=5)
9 (tid=2)
WORK IS DONE (tid=2)
9 (tid=2)
(so everything gets ordered, even the parallelizable work part)

And with icc:
1 (tid=0)
2 (tid=5)
3 (tid=1)
4 (tid=2)
WORK IS DONE (tid=1)
WORK IS DONE (tid=3)
3 (tid=1)
6 (tid=4)
7 (tid=7)
8 (tid=1)
WORK IS DONE (tid=0)
5 (tid=6)
WORK IS DONE (tid=2)
1 (tid=0)
9 (tid=0)
WORK IS DONE (tid=0)
WORK IS DONE (tid=5)
WORK IS DONE (tid=1)
9 (tid=0)
0 (tid=3)
8 (tid=1)
WORK IS DONE (tid=4)
WORK IS DONE (tid=6)
2 (tid=5)
WORK IS DONE (tid=7)
6 (tid=4)
5 (tid=6)
4 (tid=2)
7 (tid=7)
(so nothing gets ordered, not even the ordered clauses)

Is using multiple ordered clauses within one ordered loop undefined behaviour, or what is going on here? I couldn't find anything disallowing multiple clauses per loop in any of the OpenMP documentation I could find.

I know that in this trivial example I could just split the loop like this:

int main() {  
  for (int i=0; i<10; i++) {  
    printf("%i             (tid=%i)\n",i,omp_get_thread_num()); fflush(stdout);  
  }  
#pragma omp parallel for schedule(dynamic)  
  for (int i=0; i<10; i++) {  
    usleep(100*omp_get_thread_num());  
    printf("WORK IS DONE  (tid=%i)\n",omp_get_thread_num()); fflush(stdout);  
    usleep(100*omp_get_thread_num());  
  }  
  for (int i=0; i<10; i++) {  
    printf("  %i           (tid=%i)\n",i,omp_get_thread_num()); fflush(stdout);  
  }          
}  

So I'm not looking for a workaround. I really want to understand what is going on here, so that I can handle the real situation without running into anything devastating or unexpected.

I really hope you can help me.

According to the OpenMP 4.0 API specification, you can't.

Only one ordered clause can appear on a loop directive (p. 58)

I am fairly new to parallel programming, but I will try to help you.

I have modified your code and tested this version:

#include <stdio.h>
#include <unistd.h>
#include <omp.h>

int main() {

  #pragma omp parallel num_threads(8)
  {

    #pragma omp for ordered schedule(dynamic)
    for (int i=0; i<10; i++) {

          #pragma omp ordered
          {
            /* braces keep both printf and fflush inside the ordered region */
            printf("%i (tid=%i) \n",i,omp_get_thread_num()); fflush(stdout);
          }

    }

    printf("WORK IS DONE  (tid=%i)\n",omp_get_thread_num()); fflush(stdout);

  }


}

Adapt the number of threads to the machine you are running the examples on. The problem in your code is that the printf reporting that the work is done sits outside any ordered region, so every thread executes it independently and in no particular order. In my example, the iterations of the for loop run their ordered region in iteration order, as the ordered clause requires. The implicit barrier at the end of the for construct then keeps every thread waiting until all of them have finished the loop, and only after that does each thread print "WORK IS DONE". If you are not using a for construct and want the same output, you can use an explicit barrier, in other words #pragma omp barrier.

Note: "#pragma omp parallel" also ends with an implicit barrier, after which only the master thread continues executing.

Here is one possible output I obtained:

0 (tid=7) 
1 (tid=5) 
2 (tid=0) 
3 (tid=4) 
4 (tid=1) 
5 (tid=3) 
6 (tid=2) 
7 (tid=7) 
8 (tid=5) 
9 (tid=0) 

WORK IS DONE  (tid=5)
WORK IS DONE  (tid=2)
WORK IS DONE  (tid=1)
WORK IS DONE  (tid=4)
WORK IS DONE  (tid=0)
WORK IS DONE  (tid=7)
WORK IS DONE  (tid=3)
WORK IS DONE  (tid=6)

If this is the kind of output you would like to see, this is one possible way of achieving it. Hope this helps, and do not hesitate to ask for further help if necessary. Keep coding!
