My OpenMP (C) parallel code is slower than the serial bitonic sort code and although I have done all the possible combinations I could think of, I can't find the problem.
Here is part of the code:
void recBitonicSortOpenMP( int lo, int cnt, int dir ) {
if ( cnt > 1 ) {
int k = cnt / 2;
#pragma omp parallel sections
{
#pragma omp section
recBitonicSortOpenMP( lo, k, ASCENDING );
#pragma omp section
recBitonicSortOpenMP( lo + k, k, DESCENDING );
}
bitonicMergeOpenMP( lo, cnt, dir );
}
}
It is an implementation of Bitonic Sort.
For recursive problems like this use OpenMP tasks, not nested parallelism and sections. http://cs.umw.edu/~finlayson/class/fall16/cpsc425/notes/24-sorting.html gives you a good example to follow...
You should try to parallelize the imperative version of the algorithm. I think the recursive version is an inherently serial problem.
There are many pragmas useful to parallelize "For" cycles.
http://www.openmp.org/specifications/
Two version of the algorithm:
https://www.cs.duke.edu/courses/fall08/cps196.1/Pthreads/bitonic.c
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.