
OpenMP - need for atomic or reduction clauses

I am using OpenMP to parallelize a few statements with the parallel for construct. The parallelized for loop looks like this:

double solverFunction::apply(double* parameters_ , Model* varModel_)  
{
     double functionEvaluation = 0;
     Command* command_ = 0;
     Model* model_ = 0;

     #pragma omp parallel for  shared (functionEvaluation) private (model_,command_)
     for (int i=rowStart;i<rowEnd+1;i++)
     {
         model_ = new Model(varModel_);
         model_->addVariable("i", i);
         model_->addVariable("j", 1);
         command_ = formulaCommand->duplicate(model_);
         functionEvaluation += command_->execute().toDouble();
     }
}

It works most of the time: execution time is dramatically reduced and the result is as expected. However, from time to time it crashes, especially for big problems (a large number of iterations over i, a large amount of data to copy in the copy constructor call below, perhaps other factors):

 model_ = new Model(varModel_);

The call stack ends in classes such as qAtomicBasic (the program is written in C++/Qt) and QHash, and I suspect it crashes because of concurrent read/write access to memory.

However, model_ and command_ are private, so each thread works with its own copy of each. Into model_ I copy varModel_, so that the object passed in as an argument is not altered by the threads. Likewise, command_ is a copy of the member variable formulaCommand (duplicate is roughly a copy constructor).

The possible flaws I have identified in my code are:

  • functionEvaluation may be modified by several threads simultaneously

  • the copy constructor in the statement

    model_ = new Model(varModel_);

    reads the members of varModel_ in memory to construct the new (model_) instance. Concurrent access to the data members of varModel_ could occur, although their values are not altered here, only read (and assigned to other variables).

I see only two possible improvements (which I cannot test for a few days, but I am asking for advice anyway):

  • add an atomic directive, so that functionEvaluation is not written concurrently

  • add a reduction(+:functionEvaluation) clause, so that concurrent access to functionEvaluation is handled automatically

Do these solutions accurately solve the problem, and which is more efficient in general? Where might the problem lie in the code I wrote? What are the solutions?

Thanks a lot!

The first observation is that, as you've noticed yourself, modifying functionEvaluation concurrently is a bad idea. It is a data race, so the result cannot be relied on.

The read-only access of varModel_, on the other hand, is not a problem. Neither is the copy constructor call (but where is it? Your code doesn't show it).

Unrelatedly, using the private clause in C++ is a bad idea. Just declare the thread-private variables inside the parallel block (in this case, the for loop).
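
As a minimal sketch of that advice: a variable listed in a private clause gets a fresh copy in each thread that does not inherit the original's value, whereas a variable declared inside the loop body is private automatically and is initialized where it is declared. The `Scratch` struct and the `evaluateRows` function below are hypothetical stand-ins for illustration, not the asker's `Model` class.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for Model: a small object created once per iteration.
struct Scratch
{
    int i = 0;
    int j = 0;
    int value() const { return 10 * i + j; }
};

// Fills one result slot per row in [rowStart, rowEnd].
std::vector<int> evaluateRows(int rowStart, int rowEnd)
{
    assert(rowStart <= rowEnd);
    std::vector<int> results(rowEnd - rowStart + 1);

    #pragma omp parallel for
    for (int i = rowStart; i <= rowEnd; ++i)
    {
        // Declared inside the loop body, so every iteration (and therefore
        // every thread) gets its own initialized object; no private clause.
        Scratch scratch;
        scratch.i = i;
        scratch.j = 1;
        // Each iteration writes a distinct element, so there is no data race.
        results[i - rowStart] = scratch.value();
    }
    return results;
}
```

Compiled without OpenMP support the pragma is ignored and the loop simply runs sequentially, which makes this pattern easy to check against a single-threaded build.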

I also don't see why you are using pointers here. Their use doesn't make immediate sense; use stack-allocated objects instead.

The following modified code should work (I've also taken the liberty of unifying the coding style … why the trailing underscores?):

double solverFunction::apply(double* parameters, Model const& varModel)
{
     double result = 0;

     // Each thread accumulates into its own copy of result;
     // the copies are added together when the loop finishes.
     #pragma omp parallel for reduction(+:result)
     for (int i = rowStart; i < rowEnd + 1; ++i)
     {
         // Locals declared here are private to each iteration.
         Model model(varModel);
         model.addVariable("i", i);
         model.addVariable("j", 1);
         Command command = formulaCommand->duplicate(model);
         result += command.execute().toDouble();
     }

     return result;
}

Note that, because floating-point addition is not associative and the reduction combines the per-thread partial sums in an unspecified order, this code may yield slightly different results from the sequential code. This is unavoidable.
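
The effect can be reproduced without any threads at all. The sketch below, using made-up values chosen to exaggerate the rounding, adds the same three doubles in two different orders; the function names are illustrative only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Adds the terms from first to last.
double sumForward(const std::vector<double>& terms)
{
    double sum = 0.0;
    for (std::size_t k = 0; k < terms.size(); ++k)
        sum += terms[k];
    return sum;
}

// Adds the same terms from last to first.
double sumBackward(const std::vector<double>& terms)
{
    double sum = 0.0;
    for (std::size_t k = terms.size(); k > 0; --k)
        sum += terms[k - 1];
    return sum;
}

// With IEEE 754 doubles the small term is lost when it is added to the
// large one first, so the two orders disagree.
void demonstrateOrderDependence()
{
    const std::vector<double> terms = {1e16, -1e16, 1.0};
    assert(sumForward(terms) != sumBackward(terms));
}
```

A parallel reduction changes the order of additions in a similar way, although with real data the difference usually shows up only in the last few digits.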

Concurrently modifying functionEvaluation is definitely a problem in your code, and the best way to deal with it is the reduction clause.
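
To make the comparison between the two options concrete, here is a minimal sketch of both. The `evaluate` function is a hypothetical stand-in for `command_->execute().toDouble()`, and the function names are made up for illustration. Both versions are race-free; the atomic version synchronizes on the shared sum in every iteration, while the reduction version lets each thread accumulate privately and combines the partial sums once at the end, which is why reduction is generally the cheaper choice for a plain sum.

```cpp
#include <cassert>

// Hypothetical stand-in for command_->execute().toDouble().
double evaluate(int i) { return 0.5 * i; }

// Option 1: atomic update of a shared accumulator.
double sumWithAtomic(int rowStart, int rowEnd)
{
    assert(rowStart <= rowEnd);
    double sum = 0.0;
    #pragma omp parallel for
    for (int i = rowStart; i <= rowEnd; ++i)
    {
        // Do the expensive work outside the atomic update.
        const double term = evaluate(i);
        #pragma omp atomic
        sum += term;
    }
    return sum;
}

// Option 2: reduction; each thread gets a private sum initialized to 0.
double sumWithReduction(int rowStart, int rowEnd)
{
    assert(rowStart <= rowEnd);
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (int i = rowStart; i <= rowEnd; ++i)
        sum += evaluate(i);
    return sum;
}
```

Neither option addresses whatever happens inside the per-iteration work itself, so a crash that originates in shared Qt objects would need to be investigated separately.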

A further problem is the fact that you are allocating heap memory by calling new in parallel, which is rarely a good idea for many iterations, since the heap is shared between threads and allocator implementations often serialize calls to new with a lock. Consider switching to stack allocations, since the stacks are private to each thread, while the heap is shared.
