OpenMP: returning a value from inside a parallel for loop

I have an OpenMP code snippet as follows:

#ifdef _OPENMP
  #pragma omp parallel for default(none) \
  private(i, a_output) \
  shared(n, t_input, t0, trace_amp)
#endif
    for (i = 0; i < n; i++){
        if( t_input >= t0[i] )
        {

                a_output = trace_amp[i];

                return a_output;
        }
    }

Is this code correct? Why does a_output have to be private? Can it be shared?

As @1201ProgramAlarm said, you cannot have a return statement inside a parallel region. Neither GCC nor Clang will even compile the code:

$ gcc Untitled-1.c -fopenmp
Untitled-1.c: In function ‘main’:
Untitled-1.c:7:20: error: invalid branch to/from OpenMP structured block
             return a_output;

$ clang Untitled-1.c -fopenmp=libomp
Untitled-1.c:7:13: error: cannot return from OpenMP region
            return a_output;
            ^
1 error generated.

However, version 4.0 of the OpenMP specification introduced a new directive: cancel. With it, you can interrupt the execution of a parallel region or of a worksharing loop. It adds some overhead to the total execution time, because in each iteration the threads have to test whether they should stop.

You can try rewriting the original for loop this way:

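  a_output = 0;  /* value returned when no element matches */
#ifdef _OPENMP
  #pragma omp parallel default(none) \
  private(i) \
  shared(n, t_input, t0, trace_amp, a_output)
  #pragma omp for
#endif
  for (i = 0; i < n; i++){
      if( t_input >= t0[i] ){
          a_output = trace_amp[i];  /* may race if several elements match */
          #pragma omp cancel for
      }
      /* reached by every thread, so all of them notice the cancellation */
      #pragma omp cancellation point for
  }
  return a_output;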

You must separate the for from the parallel directive to avoid an implicit nowait clause on the for construct.
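Whether the cancel directive in the loop above has any effect depends on the runtime: cancellation has to be enabled, and the OpenMP 4.0 routine omp_get_cancellation() reports the current setting. Below is a minimal sketch of such a check. The helper name cancellation_enabled is hypothetical, and the version guard assumes a compiler that defines _OPENMP as 201307 or later for OpenMP 4.0.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: returns 1 if the OpenMP runtime will honor
   cancel directives, 0 otherwise (including builds without OpenMP). */
int cancellation_enabled(void)
{
#if defined(_OPENMP) && _OPENMP >= 201307
    /* omp_get_cancellation() is nonzero when cancellation is active. */
    return omp_get_cancellation() != 0;
#else
    return 0;
#endif
}
```

If this returns 0, the cancel directive is effectively ignored and the loop simply runs over all n iterations.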

EDIT: as stated by @Zulan

1) There must be at least one cancellation point that all threads naturally reach once a cancellation has been requested. Although the cancel directive itself is a cancellation point by definition, here it sits inside an if statement that not every thread may enter. The solution is to add a cancellation point outside the if statement. I changed the code to match this.

2) Cancellation is disabled by default in most runtimes. To enable it, set the OMP_CANCELLATION environment variable to true.

3) There is still a race condition on a_output. Unless you are sure that t0 contains only one value less than or equal to t_input, two or more threads may write to a_output before the cancellation takes effect. You should review the logic behind your code to decide whether this is a problem.
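If the intent is to return the amplitude of the first matching element, as the sequential loop would, one way to sidestep the race in point 3 is to drop cancellation and reduce over the matching index instead. The following is only a sketch under assumptions: the function name first_match_amp, the double element type, and the fallback parameter are not in the original question. It relies on the min reduction available for C since OpenMP 3.1.

```c
/* Hypothetical helper: returns trace_amp[i] for the smallest i with
   t_input >= t0[i], or fallback when no element matches.
   Element types are assumed to be double. */
double first_match_amp(int n, double t_input, const double *t0,
                       const double *trace_amp, double fallback)
{
    int first = n;  /* n means "no match found" */
    int i;

    /* Each thread keeps the smallest matching index it has seen;
       the min reduction then combines the per-thread results. */
    #pragma omp parallel for default(none) \
        shared(n, t_input, t0) reduction(min:first)
    for (i = 0; i < n; i++) {
        if (t_input >= t0[i] && i < first)
            first = i;
    }

    return (first < n) ? trace_amp[first] : fallback;
}
```

The trade-off is that every iteration is visited, so there is no early exit, but the result no longer depends on thread scheduling or on OMP_CANCELLATION being set.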
