
Parallelize function using OpenMP

I'm trying to run code in parallel, but I'm confused by the private/shared clauses and related OpenMP concepts. I'm using C++ (msvc12 or gcc) and OpenMP.

The code iterates over a loop whose body consists of a block that should run in parallel, followed by a block that should run only once all the parallel work is done. The order in which the parallel work is processed does not matter. The code looks like this:

// some X, M, N, Y, Z are some constant values
const int processes = 4;
std::vector<double> vct(X);
std::vector<std::vector<double> > stackVct(processes, std::vector<double>(Y));
std::vector<std::vector<std::string> > files(processes, std::vector<std::string>(M));
for(int i=0; i < N; ++i)
{
  // parallel stuff
  for(int process = 0; process < processes; ++process)
  {
    std::vector<double> &otherVct = stackVct[process];
    const std::vector<std::string> &my_files = files[process];

    for(int file = 0; file < my_files.size(); ++file)
    { 
      // vct is read-only here, the value is not modified
      doSomeOtherStuff(otherVct, vct);

      // my_files[file] is read-only
      std::vector<double> thirdVct(Y);
      doSomeOtherStuff(my_files[file], thirdVct);

      // thirdVct and vct are read-only
      doSomeOtherStuff2(thirdVct, otherVct, vct);
    }
  }
  // when all the parallel stuff is done, do this job
  // single thread stuff
  // stackVct is read-only, vct is modified
  doSingleTheadStuff(vct, stackVct);
}

If it is better for performance, doSingleTheadStuff(...) can be moved into the parallel region, but it must be executed by a single thread. The order of the function calls in the innermost loop cannot be changed.

How should I declare the #pragma omp directives to make this work? Thanks!

To run a for loop in parallel, put #pragma omp parallel for directly above the for statement. By default, variables declared outside the loop are shared by all threads, while variables declared inside the loop body (and the loop variable itself) are private to each thread.
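A minimal sketch of those default sharing rules, assuming a hypothetical helper named squareAll that is not part of the question's code. The loop index is a signed int because older OpenMP implementations (such as the one shipped with MSVC) require a signed loop variable:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration: squares every element of `input`.
std::vector<double> squareAll(const std::vector<double>& input)
{
    // Declared outside the loop: shared by all threads.
    // Safe here because each iteration writes a different element.
    std::vector<double> output(input.size());
    const int n = static_cast<int>(input.size());

    #pragma omp parallel for
    for (int i = 0; i < n; ++i)   // the loop variable i is private to each thread
    {
        // Declared inside the loop body: private to the thread running this iteration.
        const double value = input[static_cast<std::size_t>(i)];
        output[static_cast<std::size_t>(i)] = value * value;
    }
    return output;
}
```

If OpenMP is not enabled at compile time, the pragma is ignored and the loop simply runs serially with the same result.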

Note that if you are doing file I/O in parallel you may not see much speedup (next to none if file I/O is all you are doing) unless at least some of the files reside on different physical hard drives.

Maybe something like this (mind you, this is just a sketch; I did not verify it, but you should get the idea):

// some X, M, N, Y, Z are some constant values
const int processes = 4;
std::vector<double> vct(X);
std::vector<std::vector<double> > stackVct(processes, std::vector<double>(Y));
std::vector<std::vector<std::string> > files(processes, std::vector<std::string>(M));
for(int i=0; i < N; ++i)
{
    // parallel stuff
    // vct must be shared: doSingleTheadStuff modifies it, and a
    // firstprivate copy would be discarded at the end of the region
    #pragma omp parallel firstprivate(files) shared(vct, stackVct)
    {
        #pragma omp for
        for(int process = 0; process < processes; ++process)
        {
            std::vector<double> &otherVct = stackVct[process];
            const std::vector<std::string> &my_files = files[process];

            for(int file = 0; file < my_files.size(); ++file)
            {
                // vct is read-only here, the value is not modified
                doSomeOtherStuff(otherVct, vct);

                // my_files[file] is read-only
                std::vector<double> thirdVct(Y);
                doSomeOtherStuff(my_files[file], thirdVct);

                // thirdVct and vct are read-only
                doSomeOtherStuff2(thirdVct, otherVct, vct);
            }
        }
        // the implicit barrier at the end of "omp for" guarantees that
        // all the parallel stuff is done before this point
        // single thread stuff
        // stackVct is read-only, vct is modified
        #pragma omp single nowait
        doSingleTheadStuff(vct, stackVct);
    }
}
  • I marked files as firstprivate because it is read-only and should not be modified, so each thread gets its own copy of it. (Sharing it would also work, since it is only read, and would avoid the copies.)
  • vct and stackVct are shared among all threads. Each thread modifies only its own element of stackVct, and vct is only read inside the loop. vct cannot be firstprivate, because doSingleTheadStuff would then update a private copy that is thrown away when the parallel region ends.
  • Finally, only one thread executes doSingleTheadStuff. It starts only after the implicit barrier at the end of the omp for loop, so all the parallel work is finished; nowait just drops the extra barrier at the end of the single construct.
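
The "parallel loop, then one thread finishes the round" pattern can be tried out in isolation. The following is only a sketch under stated assumptions: runRounds, total and partial are hypothetical names standing in for the question's outer loop, vct and stackVct. Here total is kept shared so that the update made inside the single construct is still there in the next round:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the question's workflow: each of `workers` slots
// computes a partial result from `total`, then exactly one thread folds the
// partial results back into `total` before the next round starts.
double runRounds(int rounds, int workers)
{
    double total = 1.0;   // shared: read in the loop, updated in `single`
    std::vector<double> partial(static_cast<std::size_t>(workers));   // shared: slot w is written only by iteration w

    for (int round = 0; round < rounds; ++round)
    {
        #pragma omp parallel shared(total, partial)
        {
            #pragma omp for
            for (int w = 0; w < workers; ++w)
            {
                // Reads the shared total, writes only its own slot.
                partial[static_cast<std::size_t>(w)] = total * (w + 1);
            }
            // Implicit barrier at the end of `omp for`: every slot is filled here.

            #pragma omp single
            {
                double sum = 0.0;
                for (int w = 0; w < workers; ++w)
                    sum += partial[static_cast<std::size_t>(w)];
                total = sum;   // written to the shared variable, so the next round sees it
            }
        }
    }
    return total;
}
```

With 2 rounds and 4 workers, the first round sums 1 + 2 + 3 + 4 = 10 and the second sums 10 + 20 + 30 + 40, so runRounds(2, 4) returns 100. Had total been listed as firstprivate, the assignment in the single block would only change one thread's copy and the function would return 1.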
