OpenMP: shared or nothing, private vs. uninitialized

Is there a difference between these two OpenMP implementations?

float dot_prod (float* a, float* b, int N)
{
  float sum = 0.0;
#pragma omp parallel for shared(sum)
  for (int i = 0; i < N; i++) {
    #pragma omp critical
    sum += a[i] * b[i];
  }
  return sum;
}

and the same code, except that line 4 omits shared(sum) because sum is already initialized?

#pragma omp parallel for
for (int i = 0; ....)

Same question for private in OpenMP:

Is

void work(float* c, int N)
{
  float x, y; int i;
#pragma omp parallel for private(x,y)
  for (i = 0; i < N; i++)
  {
    x = a[i]; y = b[i];
    c[i] = x + y;
  }
}

the same as without the private(x,y) because x and y aren't initialized?

#pragma omp parallel for 

Is there a difference between these two OpenMP implementations?

float dot_prod (float* a, float* b, int N)
{
  float sum = 0.0;
# pragma omp parallel for shared(sum)
  for (int i = 0; i < N; i++) {
    #pragma omp critical
    sum += a[i] * b[i];
  }
  return sum;
}

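In OpenMP, a variable declared outside the parallel region is shared by default, unless it is explicitly made private. Whether it is initialized has no influence on this. Hence the shared(sum) clause can be omitted, and both versions behave identically. (The loop variable i of a parallel for is the exception: it is always private.)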
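To make the default visible, here is a minimal sketch that uses default(none), which forces every variable used in the region to be listed explicitly. The function name dot_prod_explicit and the const qualifiers are illustrative assumptions, not part of the question's code.

```c
/* Sketch: same computation as dot_prod, but with the data-sharing
   attributes spelled out instead of relying on the defaults. */
float dot_prod_explicit(const float *a, const float *b, int n)
{
    float sum = 0.0f;
    /* default(none): the compiler rejects any variable whose sharing
       is not stated. The loop variable i is private automatically. */
    #pragma omp parallel for default(none) shared(a, b, n, sum)
    for (int i = 0; i < n; i++) {
        /* sum is shared, so the update must be protected. */
        #pragma omp critical
        sum += a[i] * b[i];
    }
    return sum;
}
```

Removing shared(a, b, n, sum) together with default(none) gives the same behaviour, because shared is what these variables get by default.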

But your code is far from optimal. It works, but it will be much slower than its sequential counterpart: critical lets only one thread at a time execute the update, so the additions are effectively serialized, and entering and leaving a critical section has a significant cost of its own.

The proper implementation would use a reduction.

float dot_prod (float* a, float* b, int N)
{
  float sum = 0.0;
# pragma omp parallel for reduction(+:sum)
  for (int i = 0; i < N; i++) {
    sum += a[i] * b[i];
  }
  return sum;
}

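The reduction clause gives every thread a hidden private copy of sum, initialized to 0 for the + operator, in which the thread accumulates its share of the iterations. At the end of the loop, these per-thread partial sums are combined into the shared variable sum (conceptually, one atomic addition per thread).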
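Written out by hand, the effect of the reduction looks roughly like the sketch below. This only illustrates the idea and is not how any particular OpenMP runtime implements it; the names dot_prod_manual and local_sum are hypothetical.

```c
/* Sketch: a hand-written equivalent of reduction(+:sum). */
float dot_prod_manual(const float *a, const float *b, int n)
{
    float sum = 0.0f;
    #pragma omp parallel
    {
        /* Declared inside the parallel region, so each thread
           has its own copy. */
        float local_sum = 0.0f;

        /* The iterations are divided among the threads; no
           synchronization is needed inside the loop. */
        #pragma omp for
        for (int i = 0; i < n; i++) {
            local_sum += a[i] * b[i];
        }

        /* One protected update per thread instead of one per iteration. */
        #pragma omp atomic
        sum += local_sum;
    }
    return sum;
}
```

The synchronization cost is now paid once per thread rather than once per element, which is why the reduction version scales and the critical version does not.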

Same question for private in OpenMP:

void work(float* c, int N)
{
  float x, y; int i;
# pragma omp parallel for private(x,y)
  for (i = 0; i < N; i++)
  {
    x = a[i]; y = b[i];
    c[i] = x + y;
  }
}

By default, x and y are shared, because they are declared outside the parallel region. So without private the behaviour is different, and buggy: all threads read and write the same x and y without any synchronization, which is a data race.

the same as without the private(x,y) because x and y aren't initialized?

Whether x and y are initialized does not matter; what matters is where they are declared. To ensure correct behaviour, they must be made private. The code is then correct, because each thread's private x and y start out uninitialized but are assigned before being used in every iteration.
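Since the declaration site is what decides, an alternative to the private clause is to declare the temporaries inside the loop body. The sketch below assumes that a and b are passed in as parameters (the question leaves their declaration out), and the name work_local is hypothetical.

```c
/* Sketch: variables declared inside the loop body are private to
   each thread, so no private(...) clause is needed. */
void work_local(const float *a, const float *b, float *c, int n)
{
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        float x = a[i];   /* private: declared inside the region */
        float y = b[i];   /* private: declared inside the region */
        c[i] = x + y;     /* each iteration writes a distinct element */
    }
}
```

Declaring variables in the narrowest scope makes it harder to create this kind of race by accident.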
