Why is my OpenMP implementation slower than a single threaded implementation? (Followup)

Original post 2011-02-18 16:09:19 9 1 c/ openmp

This is a follow-up to "Why is my OpenMP implementation slower than a single threaded implementation?".

I followed the answer provided there and used tasking instead of for pragmas to speed up the code. However, the reworked program runs exactly as fast as the equivalent sequential program. I see no speedup.

The reworked code is here: http://pastebin.com/3SFaNEc4

I simply removed all the for pragmas and replaced them with task pragmas for the recursive procedures.

Am I doing anything wrong? I should be seeing an almost linear speed up. What do you guys think?

Thanks!

1 answer

First, you still have an "#pragma end critical", which should be removed. It isn't causing a problem, but it is incorrect. Second, as I said in the other question you posted, you may have to rethink how you are parallelizing the code to see a speedup; simply replacing the other pragmas with task pragmas may not make it faster. Third, you haven't put the tasks inside a parallel region, so you are not running in parallel at all. And you can't just wrap a parallel region around the tasks, or every thread in the team will create the same tasks and the work will be done multiple times.
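To illustrate the third point, here is a minimal sketch of the usual pattern: open a parallel region, let exactly one thread start the recursion with "#pragma omp single", and let the tasks it creates be picked up by the other threads in the team. The pastebin code is not reproduced here, so a recursive Fibonacci stands in for the recursive procedures. The names fib_task and fib_parallel and the cutoff of 20 are assumptions made for this example, not part of the original program.

```c
/* Hypothetical recursive workload standing in for the poster's code. */
static long fib_task(int n)
{
    long a, b;

    if (n < 2)
        return n;

    /* Assumed cutoff: small subproblems recurse serially so that task
       creation overhead does not swamp the useful work. */
    if (n < 20)
        return fib_task(n - 1) + fib_task(n - 2);

    /* a and b are locals, so they must be marked shared for the
       child tasks' results to be visible here. */
    #pragma omp task shared(a)
    a = fib_task(n - 1);

    #pragma omp task shared(b)
    b = fib_task(n - 2);

    /* Wait for both child tasks before combining their results. */
    #pragma omp taskwait
    return a + b;
}

/* Hypothetical entry point: creates the thread team once. */
long fib_parallel(int n)
{
    long result = 0;

    #pragma omp parallel
    {
        /* Only one thread starts the recursion; without "single",
           every thread would generate the whole task tree again. */
        #pragma omp single
        result = fib_task(n);
    }
    return result;
}
```

Calling fib_parallel(30) from main and compiling with OpenMP enabled (for example gcc -fopenmp) runs the task tree on all threads in the team. Compiled without OpenMP, the pragmas are ignored and the same code runs sequentially. Whether this gives a real speedup still depends on how much work each task does, which is the second point above.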

Why is my OpenMP implementation slower than a single threaded implementation?

Quicksort - why is my dutch-flag implementation slower than my Hoare-2-partition implementation?

Why is this implementation of Quick Sort slower than qsort?

Why MPI and OpenMP Merge Sort are slower than my sequential code?

Why is my implementation of selectionSort faster than my implementation of bubbleSort?

Dijkstra Algorithm OpenMP Slower than Single Thread

OpenMP sections run slower than single thread

Why POSIX Threads are Slower Than OpenMP

OpenMP and GSL RNG - Performance Issue - 4 threads implementation 10x slower than pure sequential one (quadcore CPU)

Modulo operator slower than manual implementation?

solution1 2 ACCEPTED 2011-02-18 16:34:48