Should I use GNU parallel mode functions inside an OpenMP parallel region (for-loop, tasks)?
I have a program accelerated by OpenMP. Inside the parallel region, functions like std::nth_element, std::sort, and std::partition are called. These functions are used to process each OpenMP thread's corresponding part of an array.
Recently, I found that g++ has implemented parallel versions of the above functions, so I wonder: should I use functions like __gnu_parallel::nth_element inside a #pragma omp task or #pragma omp for region? If I used the parallel mode, would the total number of threads exceed the limit set by omp_set_num_threads() and lead to worse speedup?
Trivial (and best) answer: benchmark and post your findings.
Less definitive: in my experience, the parallel versions of most algorithms are less efficient than the comparable serial ones, instead relying on multiple parallel processors to compensate in wall time.

Regarding the number of threads, I don't think OMP will spawn new threads if it is already at the limit. I do remember that nested #pragma omp for regions don't actually result in each of the outer threads spawning more "inner threads" without a specific flag (which I don't remember off the top of my head).