
C++ OpenMP parallel slower than serial

I have implemented a parallel version of Prim's algorithm for finding a minimum spanning tree in C++ using OpenMP. Sometimes it is a little faster than the serial version (7.95 ms), but sometimes it takes 12.7 ms, which is much slower than the serial version (for which I get 9.69 ms). Here is the parallel version of my code:

https://dpaste.de/dUt6

Can you please help out with this?

Also, is there a reliable way to measure the performance of my code? time.h does not seem to be precise enough.

Thanks a lot!

OpenMP has an overhead that adds a constant term to the run time. Let me give an example.

Let's assume your serial algorithm finishes in time A*n, where A is some constant and n is the number of items you iterate over. Let's also assume that your algorithm parallelizes perfectly, so that with k threads the parallel work takes A*n/k. Because of the OpenMP overhead, the actual run time will be A*n/k + B, where B is the overhead. Therefore, to see any benefit from OpenMP you need A*n/k + B < A*n, that is, n > B*k/(A*(k-1)) for k > 1. For values of n in the range [0, threshold], OpenMP will actually be slower than the serial algorithm because of the overhead B.
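To make the inequality concrete, here is a minimal sketch of the cost model in C++. The function names (serial_time, parallel_time, break_even_n) are made up for illustration, and A and B are placeholders you would have to estimate by measurement on your own machine; the model itself is only the idealized one described above.

```cpp
#include <cassert>
#include <limits>

// Modeled serial run time: A * n.
double serial_time(double A, double n) { return A * n; }

// Modeled parallel run time with k threads and a fixed overhead B: A * n / k + B.
double parallel_time(double A, double B, double n, int k) {
    assert(k >= 1);
    return A * n / k + B;
}

// Problem size above which the parallel model beats the serial one:
//   A*n/k + B < A*n   <=>   n > B*k / (A*(k-1))   (for k > 1).
double break_even_n(double A, double B, int k) {
    assert(A > 0.0 && k >= 1);
    if (k == 1) {
        // With a single thread there is no speedup to pay back the overhead.
        return std::numeric_limits<double>::infinity();
    }
    return B * k / (A * (k - 1));
}
```

For instance, with A = 1, B = 3 and k = 4 the model gives a break-even size of n = 4: below that the overhead dominates, above it the parallel version wins.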

Another important point is that OpenMP has a different overhead (and therefore a different threshold) depending on whether it has already been used in the program. I call these the cold and warm thresholds.

// requires #include <omp.h>
double dtime_cold = omp_get_wtime();
foo();  // cold - OpenMP has not been used before
dtime_cold = omp_get_wtime() - dtime_cold;

double dtime_warm = omp_get_wtime();
foo();  // warm - OpenMP has already been used once
dtime_warm = omp_get_wtime() - dtime_warm;

If n is large enough, the constant terms are insignificant, in which case the thresholds don't matter.
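The cold/warm measurement can be sketched as a small reusable harness. This is only an illustration under a few assumptions: parallel_sum is a hypothetical stand-in for foo(), the names time_ms, ColdWarm and measure_cold_warm are made up here, and std::chrono::steady_clock is swapped in for omp_get_wtime() so that the code also compiles when OpenMP is not enabled (the pragma is then ignored and the loop runs serially).

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Wall-clock time of one call to f, in milliseconds, using a monotonic clock.
template <class F>
double time_ms(F&& f) {
    const auto start = std::chrono::steady_clock::now();
    f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Hypothetical workload standing in for foo(): a parallel sum.
double parallel_sum(const std::vector<double>& v) {
    double sum = 0.0;
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(v.size());
    #pragma omp parallel for reduction(+ : sum)
    for (std::ptrdiff_t i = 0; i < n; ++i) {
        sum += v[static_cast<std::size_t>(i)];
    }
    return sum;
}

struct ColdWarm {
    double cold_ms;  // first call
    double warm_ms;  // average of the following calls
};

// Times the first call on its own, then averages `reps` further calls.
// The first call is only "cold" if no OpenMP region ran earlier in the process.
ColdWarm measure_cold_warm(const std::vector<double>& v, int reps) {
    assert(reps > 0);
    volatile double sink = 0.0;  // keeps the compiler from discarding the work
    ColdWarm r{};
    r.cold_ms = time_ms([&] { sink = parallel_sum(v); });
    double total = 0.0;
    for (int i = 0; i < reps; ++i) {
        total += time_ms([&] { sink = parallel_sum(v); });
    }
    r.warm_ms = total / reps;
    return r;
}
```

Averaging several warm runs also helps with run-to-run noise of the kind described in the question (7.95 ms on one run, 12.7 ms on another): a single measurement of a few milliseconds is easily disturbed by the operating system scheduler.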
