
Using OpenMP to parallelize a for loop

I'm new to OpenMP. When I parallelize a for loop using

  int i;
  #pragma omp parallel for num_threads(4)
  for (i = 0; i < 4; i++) {
    // some parallelizable code
  }

Is it guaranteed that every thread takes one and only one value of i? In general, how is the loop work divided among the threads when num_threads is not equal to, or does not evenly divide, the total number of loop iterations? Is there a clause I can use to specify that each thread takes only one value of i, or to control how many values of i each thread takes?

The work division in a loop construct is decided by the schedule. If no schedule clause is present, the schedule given by the def-sched-var internal control variable is used, which is implementation defined.
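
For illustration, a minimal sketch (the iteration count of 10 and the printf output are assumptions, not from the question) that makes the division visible by printing omp_get_thread_num() for each iteration; with no schedule clause, the split you observe depends on the implementation's default schedule:

  #include <stdio.h>
  #include <omp.h>

  int main(void) {
    int i;
    // No schedule clause: how the 10 iterations are split among the
    // 4 threads follows the implementation-defined default schedule.
    #pragma omp parallel for num_threads(4)
    for (i = 0; i < 10; i++) {
      printf("iteration %d run by thread %d\n", i, omp_get_thread_num());
    }
    return 0;
  }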

You could use schedule(static, 1), which in your case guarantees that each thread will get exactly one value of i (as sketched below).
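
A minimal sketch of this, assuming the goal is just to confirm the one-iteration-per-thread mapping (the printf output is illustrative):

  #include <stdio.h>
  #include <omp.h>

  int main(void) {
    int i;
    // schedule(static, 1) deals iterations out round-robin in chunks of
    // one, so with 4 iterations and 4 threads each thread gets exactly
    // one value of i (thread t gets iteration t).
    #pragma omp parallel for num_threads(4) schedule(static, 1)
    for (i = 0; i < 4; i++) {
      printf("i = %d handled by thread %d\n", i, omp_get_thread_num());
    }
    return 0;
  }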

I highly recommend taking a look at the OpenMP specification, in particular Table 2.5 and Section 2.7.1.1.

There may be legitimate reasons for making this kind of assumption, but in general the correctness of your loop code should not depend on it. Primarily, I would treat the schedule as a performance hint.

Depending on your use case, you may want to consider tasks or plain parallel constructs (see the sketch below). If you rely on such details for loops, make sure they are well specified in the standard, and not something that merely works in your particular implementation.
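
If the real requirement is "thread t handles exactly work item t", a plain parallel construct states that directly instead of relying on the loop schedule; a minimal sketch (the printf body is a placeholder for the actual per-thread work):

  #include <stdio.h>
  #include <omp.h>

  int main(void) {
    // A plain parallel region makes "thread t does work item t" explicit
    // rather than depending on how a loop schedule assigns iterations.
    #pragma omp parallel num_threads(4)
    {
      int t = omp_get_thread_num();
      printf("work item %d done by thread %d\n", t, t);
    }
    return 0;
  }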
