
How can I write code that reuses threads with OpenMP in C++?

I have a function f that can use parallel processing. For this purpose, I used OpenMP. However, this function is called many times, and it seems that threads are created on every call.

How can I reuse the threads?

void f(X &src, Y &dest) {
   ... // do processing based on "src"
   #pragma omp parallel for
   for (...) { 

   }
   ...// put output into "dest" 
}

int main() {
    ...
    for(...) { // This loop itself cannot be parallelized.
       f(...);
    }
    ...
    return 0;
}

OpenMP implements a thread pool internally. It tries to reuse threads unless you change some of its settings between parallel regions, or call parallel regions from different application threads while others are still active.

One can verify that the threads are indeed the same by using thread-local variables, so I'd recommend verifying your claim that the threads are recreated (a sketch of one way to check this follows below). The OpenMP runtime does lots of smart optimizations beyond the obvious thread-pool idea; you just need to know how to tune and control it properly.
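For instance, here is a minimal sketch (not part of the original answer) of such a check: a thread_local counter keeps growing across calls only if the same OS threads are reused.

#include <cstdio>
#include <omp.h>

thread_local int regions_seen = 0;   // one counter per OS thread

void f() {
   #pragma omp parallel
   {
      ++regions_seen;                // the count persists only if the thread survives
      #pragma omp critical
      std::printf("thread %d has executed %d region(s) so far\n",
                  omp_get_thread_num(), regions_seen);
   }
}

int main() {
   for (int i = 0; i < 3; ++i)
      f();                           // with thread reuse, each thread prints 1, 2, 3
   return 0;
}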

While it is unlikely that threads are recreated, it is easy for them to have gone to sleep by the time you call the parallel region again, and waking them up takes a noticeable amount of time. You can prevent threads from going to sleep by setting OMP_WAIT_POLICY=active and/or implementation-specific environment variables such as KMP_BLOCKTIME=infinite (for the Intel/LLVM run-times).
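As a rough illustration (this snippet is not from the original answer), you can time successive entries into an empty parallel region with omp_get_wtime() and compare a run using the default wait policy against one with OMP_WAIT_POLICY=active:

#include <cstdio>
#include <omp.h>

int main() {
   for (int i = 0; i < 5; ++i) {
      double t0 = omp_get_wtime();
      #pragma omp parallel
      {
         // trivial body: we only measure the cost of entering/leaving the region
      }
      double t1 = omp_get_wtime();
      std::printf("parallel region %d took %.6f s\n", i, t1 - t0);
   }
   return 0;
}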

This is just in addition to Anton's correct answer. If you are really concerned about the issue, for most programs you can easily move the parallel region to the outside and keep the serial work serial, like this:

void f(X &src, Y &dest) {
   // You can also do simple computations
   // without side effects outside of the single section
   #pragma omp single
   {
   ... // do processing based on "src"
   }
   #pragma omp for // note parallel is missing
   for (...) { 

   }
   #pragma omp critical
   {
   ...// each thread puts its own part of the output into "dest"
   }
}

int main() {
    ...
    // make sure to declare the loop variable locally or make it explicitly private
    #pragma omp parallel
    for(type variable;...;...) {
       f(...);
    }
    ...
    return 0;
}

Use this only if you have measured evidence that you are suffering from the overhead of reopening parallel regions. You may have to juggle shared variables, or manually inline f, because all variables declared within f will be private; how it looks in detail therefore depends on your specific application. A concrete, self-contained sketch of the pattern is shown below.
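For concreteness, here is a small self-contained variant of the skeleton above; the vectors, loop bounds, and the accumulation into dest are invented purely for illustration.

#include <cstdio>
#include <vector>

void f(const std::vector<double> &src, double &dest) {
   static double scale;                 // function-scope static => shared across threads
   #pragma omp single
   {
      scale = 1.0 / src.size();         // serial preprocessing based on "src"
   }                                    // implicit barrier: all threads see "scale"

   double local = 0.0;                  // declared inside f => private to each thread
   #pragma omp for                      // orphaned work-sharing; "parallel" is outside
   for (long i = 0; i < (long)src.size(); ++i)
      local += src[i] * scale;
                                        // implicit barrier at the end of the for
   #pragma omp critical
   dest += local;                       // each thread adds its own part to "dest"
}

int main() {
   std::vector<double> src(1 << 20, 2.0);
   double dest = 0.0;
   #pragma omp parallel
   for (int iter = 0; iter < 4; ++iter) // loop variable declared locally => private
      f(src, dest);
   std::printf("dest = %f\n", dest);
   return 0;
}

Each outer iteration contributes 2.0 in total, so compiled with OpenMP enabled this should print dest = 8.000000 regardless of the number of threads.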
