I need to parallelize the following:
for(i=0; i<n/2; i++)
    a[i] = a[i+1] + a[2*i];
In parallel, the output will differ from the sequential output, because values that still have to be read may already have been overwritten. To get the sequential output from the parallelized loop, I want to make use of firstprivate(a), because firstprivate gives each thread a copy of a.
Let's imagine 4 threads and a loop of 100.
That means that each thread will rewrite 25% of the array.
When the parallel region is over, all the threads "merge". Does that mean that you get the same a as if you had run the loop sequentially?
#pragma omp parallel for firstprivate(a)
for(i=0; i<n/2; i++)
    a[i] = a[i+1] + a[2*i];
Answer:
As you noted, using firstprivate to copy the data for each thread does not really help you get the data back.
The easiest solution is in fact to separate input and output and have both be shared (the default).
To avoid copying the result back, simply use the new variable out instead of a from there on in the code. Alternatively, you could use pointers and swap them.
int out[100];
#pragma omp parallel for
for(i=0; i<n/2; i++)
    out[i] = a[i+1] + a[2*i];
// use out from here when you would have used a.
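The pointer-swap variant could look roughly like the sketch below. The names step and run_steps are made up for illustration, and the sketch assumes the update may be applied more than once, which the question does not state. Elements from n/2 onward are copied through so that the output buffer can fully replace the input. Note that the sequential in-place loop only ever reads a[i], a[i+1] and a[2*i], all at indices that have not been written yet, which is why reading from a separate, unmodified input reproduces the sequential result.

```c
#include <assert.h>

/* One pass of the update: reads only from in, writes only to out.
   Elements from n/2 onward are copied unchanged so that out is a
   complete replacement for in. */
static void step(const int *in, int *out, int n)
{
    assert(in != out);  /* the two buffers must not alias */
    int half = n / 2;

    #pragma omp parallel for
    for (int i = 0; i < half; i++)
        out[i] = in[i + 1] + in[2 * i];

    #pragma omp parallel for
    for (int i = half; i < n; i++)
        out[i] = in[i];
}

/* Applies the update `steps` times, swapping the roles of the two
   buffers after each pass instead of copying data back.
   Returns the buffer that holds the final result. */
static int *run_steps(int *a, int *b, int n, int steps)
{
    int *cur = a, *next = b;
    for (int s = 0; s < steps; s++) {
        step(cur, next, n);
        int *tmp = cur;  /* swap pointers, not array contents */
        cur = next;
        next = tmp;
    }
    return cur;
}
```

The caller keeps using the pointer returned by run_steps wherever it would have used a. Compiled without OpenMP support, the pragmas are ignored and the code runs sequentially with the same result.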
There is no easy and general way to have private copies of a for each thread and then merge them afterwards. lastprivate just copies one incomplete output array from the thread executing the last iteration, and reduction doesn't know which elements to take from which array. Even if it did, it would be wasteful to copy the entire array for each thread. Having shared inputs and outputs here is much more efficient.
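To make the cost concrete, here is a rough sketch of what a hand-written "private copy, then merge" version might look like. This is not something OpenMP provides through a clause; update_with_snapshots is a hypothetical helper, and the explicit malloc/memcpy per thread is an assumption about how one would emulate it. Every thread copies all n elements, although it only needs the few it reads.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: updates a in place, with every thread reading
   from its own full snapshot of a. Returns 0 on success, -1 if an
   allocation failed (a is then left unmodified). */
static int update_with_snapshots(int *a, int n)
{
    assert(a != NULL);
    if (n < 2)
        return 0;  /* n/2 == 0, so the loop has no iterations */

    int failed = 0;

    #pragma omp parallel
    {
        /* One full copy of a per thread: this is the wasteful part. */
        int *snap = malloc((size_t)n * sizeof *snap);
        if (snap != NULL) {
            memcpy(snap, a, (size_t)n * sizeof *snap);
        } else {
            #pragma omp atomic write
            failed = 1;
        }

        /* Nobody may write to a until every snapshot is complete. */
        #pragma omp barrier

        if (!failed) {
            /* Each thread writes its share of the results to shared a. */
            #pragma omp for
            for (int i = 0; i < n / 2; i++)
                a[i] = snap[i + 1] + snap[2 * i];
        }

        free(snap);
    }

    return failed ? -1 : 0;
}
```

With T threads this allocates and copies T * n elements just to avoid one extra output array of n/2 elements, which is the inefficiency described above.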