
C++: OpenMP shared memory protection

Suppose I use a shared variable, say a double, to accumulate some kind of sum over the course of the program. Would that be vulnerable to unstable behavior in any way? I mean, could more than one core access this variable concurrently and produce inconsistent results?

For example, this is a global variable:

double totalTime = 0;

and each core executes this statement:

totalTime += elapsedTime;

This last statement is executed by loading the value of totalTime into a CPU register, doing the addition, and storing the result back. I can imagine that more than one core could read the same value at the same instant, each add its own elapsedTime, and then, due to latency, the value stored in totalTime would be overwritten with the wrong value. Is that possible, and how can I solve it?

Thank you.

Clearly this operation is not thread-safe since, as you mentioned yourself, it involves several assembler instructions. In fact, OpenMP even has a special directive for this kind of operation.
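To make the hazard concrete, here is a minimal single-threaded sketch that replays by hand the interleaving described in the question. The function name lostUpdateDemo and the variables regA and regB are hypothetical, introduced only for illustration. It is not real concurrent code, just the load/add/store steps of two cores written out in one unlucky order:

```cpp
// Replays, deterministically, the case where two cores both load
// totalTime before either one stores its result.
// lostUpdateDemo, regA and regB are hypothetical names for illustration.
double lostUpdateDemo(double elapsedA, double elapsedB) {
    double totalTime = 0.0;

    double regA = totalTime;  // core A loads the shared value
    double regB = totalTime;  // core B loads the same (stale) value

    regA += elapsedA;         // core A adds in its register
    regB += elapsedB;         // core B adds in its register

    totalTime = regA;         // core A stores its result
    totalTime = regB;         // core B overwrites it: A's update is lost

    return totalTime;         // equals elapsedB, not elapsedA + elapsedB
}
```

With this ordering, the contribution of core A simply disappears from the sum.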

You will need the atomic pragma to make it, well, "atomic":

#pragma omp atomic
totalTime += elapsedTime;
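For context, a minimal sketch of how that update might sit inside a parallel loop. The helper sumWithAtomic and the input vector elapsedTimes are assumptions made for the example, not part of the original code, and the snippet needs to be compiled with OpenMP enabled (for example -fopenmp with GCC or Clang) for the pragmas to take effect:

```cpp
#include <vector>

// Hypothetical helper: adds every entry of elapsedTimes into one shared
// accumulator, protecting each update with "omp atomic".
double sumWithAtomic(const std::vector<double>& elapsedTimes) {
    double totalTime = 0.0;  // shared by all threads of the loop below
    const long n = static_cast<long>(elapsedTimes.size());

    #pragma omp parallel for
    for (long i = 0; i < n; ++i) {
        const double elapsedTime = elapsedTimes[i];

        // The read-modify-write of totalTime is performed atomically.
        #pragma omp atomic
        totalTime += elapsedTime;
    }
    return totalTime;
}
```

Without OpenMP enabled the pragmas are ignored and the loop runs serially, which gives the same sum for these inputs.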

Note that atomic only works when you have a single update to a memory location, such as an addition or an increment.

If you have a series of instructions that need to be atomic together, you must use the critical directive:

#pragma omp critical
{
    // atomic sequence of instructions
}
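As an illustration of when critical is the right tool, here is a sketch in which several related values must be updated together. The struct TimingStats and the function collectStats are hypothetical names invented for this example:

```cpp
#include <vector>

// Hypothetical aggregate: the three fields must stay consistent with
// each other, so they are updated inside a single critical section.
struct TimingStats {
    double totalTime;
    double maxTime;
    int count;
};

TimingStats collectStats(const std::vector<double>& elapsedTimes) {
    TimingStats stats{0.0, 0.0, 0};  // shared by all threads
    const long n = static_cast<long>(elapsedTimes.size());

    #pragma omp parallel for
    for (long i = 0; i < n; ++i) {
        const double elapsedTime = elapsedTimes[i];

        // Only one thread at a time executes this block.
        #pragma omp critical
        {
            stats.totalTime += elapsedTime;
            if (elapsedTime > stats.maxTime) {
                stats.maxTime = elapsedTime;
            }
            ++stats.count;
        }
    }
    return stats;
}
```

Because a critical section serializes the threads that reach it, it is best kept as short as possible.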

Edit: Here's a good suggestion from "snemarch": if you are repeatedly updating the global variable totalTime in a parallel loop, you can consider using the reduction clause to automate the process and also make it much more efficient:

double totalTime = 0;

#pragma omp parallel for reduction(+:totalTime)
for(...)
{
    ...
    totalTime += elapsedTime;
}

At the end, totalTime will correctly contain the sum of the local elapsedTime values without the need for explicit synchronization.
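A complete version of the reduction loop could look roughly like the sketch below. The function sumWithReduction and the vector elapsedTimes are assumptions made for the example, since the original loop bounds and body are elided:

```cpp
#include <vector>

// Hypothetical helper: each thread accumulates into its own private copy
// of totalTime, and OpenMP combines the copies when the loop finishes.
double sumWithReduction(const std::vector<double>& elapsedTimes) {
    double totalTime = 0.0;
    const long n = static_cast<long>(elapsedTimes.size());

    #pragma omp parallel for reduction(+:totalTime)
    for (long i = 0; i < n; ++i) {
        totalTime += elapsedTimes[i];  // updates the thread-private copy
    }
    return totalTime;  // holds the combined sum after the loop
}
```

One caveat: the order in which the partial sums are combined is not fixed, so with general floating-point data the last few bits of the result may differ from run to run.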


 