
How to run unrolled 'for' loop (tmp) in parallel using openmp in c++?

Code 1 shows the parallelization of a 'for' loop using OpenMP. I would like to achieve similar parallelization after unrolling the 'for' loops using template metaprogramming (see Code 2). Could you please help?

Code 1: Outer for loop run in parallel with four threads

void some_algorithm()
{
  // code
}

int main()
{
  #pragma omp parallel for
  for (int i=0; i<4; i++)
  {
    //some code
    for (int j=0;j<10;j++)
    {
      some_algorithm();
    }
  }
}

Code 2: Same as Code 1; I want to run the outer for loop in parallel using OpenMP. How can I do that? [1]

template <int I, int ...N>
struct Looper{
    template <typename F, typename ...X>
    constexpr void operator()(F& f, X... x) {
        for (int i = 0; i < I; ++i) {
            Looper<N...>()(f, x..., i);
        }
    }
};

template <int I>
struct Looper<I>{
    template <typename F, typename ...X>
    constexpr void operator()(F& f, X... x) {
        for (int i = 0; i < I; ++i) {
            f(x..., i);
        }
    }
};


int main()
{
    Looper<4, 10>()(some_algorithm); 
}

[1] Thanks to Nim for Code 2 (from "How to generate nested loops at compile time?").

If you remove the constexpr declarations, then you can use _Pragma("omp parallel for"), something like this:

#include <omp.h>

template <int I, int ...N>
struct Looper{
    template <typename F, typename ...X>
    void operator()(F& f, X... x) {
        _Pragma("omp parallel for if (!omp_in_parallel())")
        for (int i = 0; i < I; ++i) {
            Looper<N...>()(f, x..., i);
        }
    }
};

template <int I>
struct Looper<I>{
    template <typename F, typename ...X>
    void operator()(F& f, X... x) {
        for (int i = 0; i < I; ++i) {
            f(x..., i);
        }
    }
};

void some_algorithm(...) {
}
int main()
{
    Looper<4, 10>()(some_algorithm); 
}

You can see this being compiled to use OpenMP at https://godbolt.org/z/nPrcWP (observe the call to GOMP_parallel ...). The code also compiles with LLVM (switch the compiler to see :-)).
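To check that the parallel version still visits every index pair exactly once, something like the following sketch may help. It reuses the Looper from the answer and relies on the `if (!omp_in_parallel())` clause, which keeps any nested Looper level serial once an outer level has already started a parallel region. The helper `run_and_record` and the `hits` array are hypothetical names introduced only for this illustration, and the `#ifdef _OPENMP` guards are an added assumption so that the sketch also builds when OpenMP is not enabled.

```cpp
#include <array>
#include <atomic>
#ifdef _OPENMP
#include <omp.h>
#endif

// Looper as in the answer; the pragma is guarded so the code
// still compiles (serially) when OpenMP is not enabled.
template <int I, int... N>
struct Looper {
    template <typename F, typename... X>
    void operator()(F& f, X... x) {
#ifdef _OPENMP
        // Parallelise only if we are not already inside a parallel region.
        _Pragma("omp parallel for if (!omp_in_parallel())")
#endif
        for (int i = 0; i < I; ++i) {
            Looper<N...>()(f, x..., i);
        }
    }
};

// Innermost level: plain serial loop that finally calls f.
template <int I>
struct Looper<I> {
    template <typename F, typename... X>
    void operator()(F& f, X... x) {
        for (int i = 0; i < I; ++i) {
            f(x..., i);
        }
    }
};

// Hypothetical helper: runs the 4 x 10 loop nest and returns, for each
// (i, j) pair, how many times the loop body was called with it.
inline std::array<int, 40> run_and_record() {
    std::array<std::atomic<int>, 40> hits;
    for (auto& h : hits) h.store(0);

    // Atomic increments keep the body safe when outer iterations
    // run on different threads.
    auto body = [&hits](int i, int j) { hits[i * 10 + j].fetch_add(1); };
    Looper<4, 10>()(body);

    std::array<int, 40> out{};
    for (int k = 0; k < 40; ++k) out[k] = hits[k].load();
    return out;
}
```

Calling `run_and_record()` from `main` should yield an array in which every entry is 1, whether or not the program was built with an OpenMP flag such as `-fopenmp`. Note that the body passed to Looper must itself be thread-safe, since the outer iterations may run concurrently.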


 