I have implemented a parallel version of Prim's minimum spanning tree algorithm in C++ using OpenMP. Sometimes it is a little faster than the serial version (7.95 ms), but sometimes it takes 12.7 ms, which is much slower than the serial version (9.69 ms). Here is the parallel version of my code:
Can you please help out with this?
Moreover, is there a reliable way to measure the performance of my code? time.h does not seem to be precise enough.
Thanks a lot!
OpenMP has an overhead that adds a constant term to the run time. Let me give an example.
Let's assume your serial algorithm finishes in A*n time, where A is some constant and n is the number of items you iterate over. Let's also assume that your algorithm parallelizes perfectly, so that with k threads the parallelized work takes A*n/k. Due to the OpenMP overhead, the actual run time will be A*n/k + B, where B is the overhead. Therefore, to see any benefit from OpenMP you need A*n/k + B < A*n. For some range of values of n, [0, threshold], OpenMP will actually be slower than the serial algorithm because of the overhead B.
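The threshold follows from rearranging A*n/k + B < A*n, which gives n > B*k / (A*(k-1)) for k > 1. The sketch below just encodes that model; the function names are made up for illustration, and A and B are model parameters you would have to estimate from your own measurements.

```cpp
#include <cassert>

// Model from the text: serial time A*n, parallel time A*n/k + B.
// A, B, n and k are model parameters here, not measured values.
double serial_time(double A, double n) { return A * n; }

double parallel_time(double A, double B, double n, int k) {
    return A * n / k + B;
}

// Hypothetical helper: the problem size at which both versions take the
// same time. Solving A*n/k + B < A*n for n gives n > B*k / (A*(k-1)),
// which only makes sense for more than one thread.
double break_even_n(double A, double B, int k) {
    assert(A > 0.0 && k > 1);
    return B * k / (A * (k - 1));
}
```

For example, with A = 1, B = 10 and k = 2 the break-even size is n = 20: below that the serial version wins under this model, above it the parallel one does.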
Another important point is that OpenMP has a different overhead, and therefore a different threshold, depending on whether it has already been used in the program. I call these the cold and warm thresholds.
double dtime_cold = omp_get_wtime();
foo(); // cold - OpenMP has not been used before
dtime_cold = omp_get_wtime() - dtime_cold;
double dtime_warm = omp_get_wtime();
foo(); // warm - OpenMP has already been used once
dtime_warm = omp_get_wtime() - dtime_warm;
If n is large enough, the constant terms are insignificant, in which case the thresholds don't matter.
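Since run times of a few milliseconds vary noticeably from run to run, one possible way to measure them is to time the first (cold) call separately and then take the fastest of several warm calls. The following is only a sketch: `wall_seconds`, `Timing` and `time_cold_and_warm` are hypothetical names, and the fallback to `std::chrono::steady_clock` when the code is built without OpenMP is my own assumption, not something the answer above prescribes.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <limits>
#ifdef _OPENMP
#include <omp.h>
#endif

// Wall-clock time in seconds: omp_get_wtime() when compiled with OpenMP,
// otherwise std::chrono::steady_clock as a fallback (an assumption).
inline double wall_seconds() {
#ifdef _OPENMP
    return omp_get_wtime();
#else
    using clock = std::chrono::steady_clock;
    return std::chrono::duration<double>(clock::now().time_since_epoch()).count();
#endif
}

struct Timing {
    double cold;       // first call; includes one-time start-up cost
    double warm_best;  // fastest of the subsequent calls
};

// Hypothetical helper: runs `work` once cold, then `repeats` times warm.
// The first call is only truly cold if OpenMP was not used earlier
// in the same process.
template <typename F>
Timing time_cold_and_warm(F&& work, int repeats) {
    assert(repeats > 0);
    Timing t{};

    double start = wall_seconds();
    work();
    t.cold = wall_seconds() - start;

    t.warm_best = std::numeric_limits<double>::infinity();
    for (int i = 0; i < repeats; ++i) {
        start = wall_seconds();
        work();
        t.warm_best = std::min(t.warm_best, wall_seconds() - start);
    }
    return t;
}
```

Calling something like `time_cold_and_warm([&] { foo(); }, 20)` for both the serial and the parallel version, with `foo` standing in for your own function, should give numbers that are easier to compare than a single measurement.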