Pre-allocated private std::vector in OpenMP parallelized for loop in C++

I intend to use a buffer std::vector<size_t> buffer(100), one per thread, in a parallelized loop, as suggested by this code:

std::vector<size_t> buffer(100);
#pragma omp parallel for private(buffer)
for(size_t j = 0; j < 10000; ++j) {
    // ... code using the buffer ...
}

This code does not work. Although there is a buffer for every thread, those buffers can have size 0.

How can I allocate the buffer at the beginning of each thread? Can I still use #pragma omp parallel for? And can I do it more elegantly than this:

std::vector<size_t> buffer;
#pragma omp parallel for private(buffer)
for(size_t j = 0; j < 10000; ++j) {
    if(buffer.size() != 100) {
        #pragma omp critical
        buffer.resize(100);
    }
    // ... code using the buffer ...
}

Split the OpenMP region, as shown in this question.

Then declare the vector inside the outer region, but outside the for loop itself. This creates one local vector for each thread.

#pragma omp parallel
{
    std::vector<size_t> buffer(100);

#pragma omp for
    for(size_t j = 0; j < 10000; ++j) {

        // ... code using the buffer ...

    }
}
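To make the split-region pattern concrete, here is a minimal sketch with a made-up workload. The names process, run_split_region and results are hypothetical and not part of the original answer; the assert would fire if a thread ever received an empty buffer, which is the failure described in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical workload: fill the scratch buffer for index j and return its sum.
std::size_t process(std::vector<std::size_t>& buffer, std::size_t j) {
    assert(!buffer.empty());  // would fail for a default-constructed buffer
    for (std::size_t k = 0; k < buffer.size(); ++k) {
        buffer[k] = j + k;
    }
    return std::accumulate(buffer.begin(), buffer.end(), std::size_t{0});
}

std::vector<std::size_t> run_split_region(std::size_t n) {
    std::vector<std::size_t> results(n);  // shared; each j writes its own element

    #pragma omp parallel
    {
        // Constructed once per thread, before the work-sharing loop starts.
        std::vector<std::size_t> buffer(100);

        #pragma omp for
        for (std::size_t j = 0; j < n; ++j) {
            results[j] = process(buffer, j);
        }
    }
    return results;
}
```

Calling run_split_region(10000) from main fills results[j] with 100 * j + 4950. If the code is compiled without OpenMP support, the pragmas are ignored and the same function runs serially with a single buffer.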

The question and the accepted answer have been around for a while; here is some further information that gives additional insight into OpenMP and might therefore be helpful to other users.

In C++, the private and firstprivate clauses handle class objects differently:

From the OpenMP Application Program Interface v3.1:

private: the new list item is initialized, or has an undefined initial value, as if it had been locally declared without an initializer. The order in which any default constructors for different private variables of class type are called is unspecified.

firstprivate: for variables of class type, a copy constructor is invoked to perform the initialization of list variables.

That is, private calls the default constructor, whereas firstprivate calls the copy constructor of the corresponding class.

The default constructor of std::vector constructs an empty container with no elements, which is why the buffers have size 0.

To answer the question, this is another solution that does not require splitting the OpenMP region:

std::vector<size_t> buffer(100, 0);  
#pragma omp parallel for firstprivate(buffer)
for (size_t j = 0; j < 10000; ++j) {
  // use the buffer
}
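One way to check the difference between the two clauses is to count the loop iterations in which the thread-local vector still has 100 elements. This is only a sketch; the function names count_full_private and count_full_firstprivate are hypothetical. Note that if the code is compiled without OpenMP support, the pragmas are ignored and the private variant also sees the original 100-element vector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// With private(buffer), each thread gets a default-constructed (empty) vector.
std::size_t count_full_private(std::size_t n) {
    std::vector<std::size_t> buffer(100);
    std::size_t full = 0;
    #pragma omp parallel for private(buffer) reduction(+:full)
    for (std::size_t j = 0; j < n; ++j) {
        if (buffer.size() == 100) ++full;
    }
    return full;
}

// With firstprivate(buffer), each thread gets a copy of the original vector.
std::size_t count_full_firstprivate(std::size_t n) {
    std::vector<std::size_t> buffer(100, 0);
    std::size_t full = 0;
    #pragma omp parallel for firstprivate(buffer) reduction(+:full)
    for (std::size_t j = 0; j < n; ++j) {
        if (buffer.size() == 100) ++full;
    }
    assert(full == n);  // every iteration saw a 100-element copy
    return full;
}
```

With OpenMP enabled, count_full_private(10000) is expected to return 0 and count_full_firstprivate(10000) to return 10000, which matches the constructor behaviour quoted above.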

Edit: a word of caution regarding private variables in general. The thread stack size is limited and, unless it is set explicitly (environment variable OMP_STACKSIZE), compiler-dependent. If you use private variables with a large memory footprint, stack overflow may become an issue.
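How much of a private variable actually lands in the per-thread storage depends on its type. The sketch below compares object sizes only and assumes the default allocator; kElements and the other names are hypothetical and chosen just to make the difference visible.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical element count, used only for illustration.
constexpr std::size_t kElements = 1000000;

// A std::array stores its elements inside the object itself, so every private
// copy occupies at least kElements * sizeof(std::size_t) bytes.
constexpr std::size_t kArrayObjectBytes =
    sizeof(std::array<std::size_t, kElements>);

// A std::vector object holds only bookkeeping data (typically a few pointers);
// with the default allocator its elements are allocated from the free store.
constexpr std::size_t kVectorObjectBytes =
    sizeof(std::vector<std::size_t>);
```

A private or firstprivate std::vector therefore costs each thread only the small vector object plus a heap allocation, whereas a large private array is the kind of variable the caution above applies to. If such arrays are unavoidable, the stack size can be raised before the program starts, for example with OMP_STACKSIZE=16M.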
