
OpenMP calls and directives allowed in firstprivate variable construction?

I have the following code which works on the compilers I have available (xlC and gcc) but I don't know if it is fully compliant (I didn't find anything in the OpenMP 3.0 spec that explicitly disallows it):

#include <iostream>
#include <vector>
#include <omp.h>

struct A {
  int tid;
  A() : tid(-1) { }
  A(const A&) { tid = omp_get_thread_num(); }
};

int main() {
  A a;

  std::vector<int> v(10);
  std::vector<int>::iterator it;
#pragma omp parallel for firstprivate(a)
  for (it=v.begin(); it<v.end(); ++it)
    *it += a.tid;

  for (it=v.begin(); it<v.end(); ++it)
    std::cout << *it << ' ';
  std::cout << std::endl;
  return 0;
}

My motivation is to find out how many threads there are, and each thread's id, inside the omp parallel for section, without calling omp_get_thread_num() for every element being processed. Is there any chance that I'm causing undefined behavior?

I would just decouple the start of the parallel region from the loop, and use a private variable to hold tid:

std::vector<int>::iterator it;
int tid;
#pragma omp parallel private(tid)
{
    tid = omp_get_thread_num();
    #pragma omp for 
    for (it=v.begin(); it<v.end(); ++it)
        *it += tid; 
}
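For reference, here is a minimal self-contained sketch of the same idea that also records the team size, since the question asks for the number of threads as well. The helper name add_thread_ids and the serial fallback stubs under #ifndef _OPENMP are assumptions added for illustration, not part of the original code; tid is declared inside the region, which makes it private without needing a clause.

```cpp
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#else
// Serial fallback (an assumption of this sketch) so it still builds without OpenMP.
static inline int omp_get_thread_num() { return 0; }
static inline int omp_get_num_threads() { return 1; }
#endif

// Hypothetical helper: adds the executing thread's id to each element and
// returns the team size observed inside the parallel region.
int add_thread_ids(std::vector<int>& v) {
    int nthreads = 1;  // shared; written by exactly one thread below
    #pragma omp parallel
    {
        // Declared inside the region, so every thread has its own copy.
        const int tid = omp_get_thread_num();

        // One thread records the team size; single ends with an implicit barrier.
        #pragma omp single
        nthreads = omp_get_num_threads();

        // The existing team shares the iterations; tid is queried once per thread.
        #pragma omp for
        for (std::vector<int>::iterator it = v.begin(); it < v.end(); ++it)
            *it += tid;
    }
    return nthreads;
}
```

Called on a vector of ten zeros, this should leave every element holding an id in the range [0, nthreads), where nthreads is the returned value. Which id ends up in which element depends on the schedule.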

Added: below are the quotes from the OpenMP specification (Section 2.9.3.4) that make me think your code is conformant and so does not produce UB (however see another addition below):

... the new list item is initialized from the original list item existing before the construct. The initialization of the new list item is done once for each task that references the list item in any statement in the construct. The initialization is done prior to the execution of the construct.

For a firstprivate clause on a parallel or task construct, the initial value of the new list item is the value of the original list item that exists immediately prior to the construct in the task region where the construct is encountered.

C/C++: ... For variables of class type, a copy constructor is invoked to perform the initialization. The order in which copy constructors for different variables of class type are called is unspecified.

C/C++: A variable of class type (or array thereof) that appears in a firstprivate clause requires an accessible, unambiguous copy constructor for the class type.

Added-2: However, the specification does not say which thread executes the copy constructor for a firstprivate variable. So in theory, the master thread of the region could construct all copies of the variable. In that case omp_get_thread_num() would return the same value for every copy: either 0 or, with nested parallel regions, the thread number in the outer region. So although the behavior is defined from the OpenMP standpoint, it may result in a data race in your program, for example if tid is later used to select per-thread storage.
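If you want to keep the firstprivate object, one way to avoid depending on which thread runs the copy constructor is to set tid explicitly at the top of the region. The following is only a sketch under that assumption: the member bind_to_current_thread and the function add_bound_ids are hypothetical names, and the stub under #ifndef _OPENMP is a serial fallback added so the code builds without OpenMP.

```cpp
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#else
// Serial fallback (an assumption of this sketch) so it still builds without OpenMP.
static inline int omp_get_thread_num() { return 0; }
#endif

struct A {
    int tid;
    A() : tid(-1) {}
    // The implicit copy constructor copies tid verbatim, so the result of
    // the copy does not depend on which thread performs it.
    void bind_to_current_thread() { tid = omp_get_thread_num(); }
};

// Hypothetical helper: each thread gets a private copy of a, then binds it
// to its own id from inside the region before the work-sharing loop starts.
void add_bound_ids(std::vector<int>& v, A a) {
    const int n = static_cast<int>(v.size());
    #pragma omp parallel firstprivate(a)
    {
        a.bind_to_current_thread();  // runs on the thread that owns this copy

        #pragma omp for
        for (int i = 0; i < n; ++i)
            v[i] += a.tid;
    }
}
```

The call to omp_get_thread_num() now happens in code that is guaranteed to run on the thread using the copy, so each private a.tid holds that thread's id regardless of how the implementation initializes firstprivate copies.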

When you iterate through the vector, you should be using it != v.end(), and not it < v.end(). However, in this case your parallel for loop is no longer valid. I would restructure that section of the code in the following manner:

  #pragma omp parallel for firstprivate(a)
  for (int i = 0; i < static_cast<int>(v.size()); ++i)  // cast avoids a signed/unsigned comparison
     v[i] += a.tid;
