
OpenMP atomic and non-atomic reads/writes produce the same instructions on x86_64

According to the OpenMP Specification (v4.0), the following program contains a possible data race due to the unsynchronized read/write of i:

int i{0}; // std::atomic<int> i{0};

void write() {
// #pragma omp atomic write // seq_cst
   i = 1;
}

int read() {
   int j;
// #pragma omp atomic read // seq_cst
   j = i; 
   return j;
}

int main() {
   #pragma omp parallel
   { /* code that calls both write() and read() */ }
}

Possible solutions that came to my mind are shown in the code as comments:

  1. to protect the write and read of i with #pragma omp atomic write/read,
  2. to protect the write and read of i with #pragma omp atomic write/read seq_cst,
  3. to use std::atomic<int> instead of int as the type of i.

Here are the compiler-generated instructions on x86_64 (with -O2 in all cases):

GNU g++ 4.9.2:               i = 1;        j = i;
original code:               MOV           MOV
#pragma omp atomic:          MOV           MOV
#pragma omp atomic seq_cst:  MOV+MFENCE    MOV    (originally reported as plain MOV; see UPDATE)
std::atomic<int>:            MOV+MFENCE    MOV

clang++ 3.5.0:               i = 1;        j = i;
original code:               MOV           MOV
#pragma omp atomic:          MOV           MOV
#pragma omp atomic seq_cst:  MOV           MOV
std::atomic<int>:            XCHG          MOV

Intel icpc 16.0.1:           i = 1;        j = i;
original code:               MOV           MOV
#pragma omp atomic:          *             *
#pragma omp atomic seq_cst:  *             *
std::atomic<int>:            XCHG          MOV

* Multiple instructions with calls to __kmpc_atomic_xxx functions.
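
(For reference, listings like these can typically be reproduced by compiling with OpenMP enabled and emitting assembly; the exact OpenMP flag is an assumption here and varies by compiler and version, e.g. -fopenmp for g++/clang++ and -qopenmp for icpc:)

g++     -O2 -fopenmp  -S -o gcc.s    test.cpp
clang++ -O2 -fopenmp  -S -o clang.s  test.cpp
icpc    -O2 -qopenmp  -S -o icpc.s   test.cpp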

What I wonder is why the GNU and Clang compilers do not generate any special instructions for #pragma omp atomic writes. I would expect instructions similar to those for std::atomic, i.e., either MOV+MFENCE or XCHG. Any explanation?
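
For comparison, here is a minimal std::atomic-only sketch of my own (not part of the original question) that isolates the same distinction. On x86_64 at -O2, the relaxed store typically compiles to a plain MOV, while the seq_cst store typically compiles to MOV+MFENCE with g++ or XCHG with clang++, matching the std::atomic rows in the tables above.

#include <atomic>

std::atomic<int> i{0};

void write_relaxed() {
   // Atomic but not sequentially consistent: on x86_64 an aligned store
   // is already atomic, so this typically compiles to a plain MOV.
   i.store(1, std::memory_order_relaxed);
}

void write_seq_cst() {
   // Sequentially consistent store: typically MOV+MFENCE (g++) or
   // XCHG (clang++) on x86_64.
   i.store(1, std::memory_order_seq_cst);
}

int read_seq_cst() {
   // Even a seq_cst load is typically a plain MOV on x86_64; the
   // ordering cost is paid on the store side.
   return i.load(std::memory_order_seq_cst);
}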

UPDATE

g++ 5.3.0 produces MFENCE for #pragma omp atomic write seq_cst. That is the correct behavior, I believe. Without seq_cst, it produces a plain MOV, which is sufficient for non-SC atomicity.

There was a bug in my Makefile; g++ 4.9.2 produces MFENCE for the SC atomic write as well. Sorry guys for that.

Clang 3.5.0 does not implement the OpenMP SC atomics; thanks to Hristo Iliev for pointing this out.

There are two possibilities.

  1. The compiler is not obligated to convert C++ code containing a data race into bad machine code. Depending on the machine memory model, the instructions normally used may already be atomic and coherent. Take that same C++ code to another architecture and you may start seeing the pragmas cause differences that didn't exist on x86_64.

  2. In addition to potentially causing the use of different instructions and/or extra memory fence instructions, the atomic pragmas (as well as std::atomic and volatile) also constrain the compiler's own code-reordering optimizations. They may not apply to your simple case, but you certainly could see that common-subexpression elimination, including hoisting computations out of a loop, may be affected (see the sketch after this list).
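
As a hedged illustration of the second point (a sketch of my own, not from the original answer): with a plain int flag the optimizer may hoist the load out of a wait loop, because a racy read is undefined behavior, whereas an atomic read must be re-performed on every iteration.

#include <atomic>

int plain_ready = 0;
std::atomic<int> atomic_ready{0};

void wait_plain() {
   // Under -O2 the compiler may read plain_ready once and hoist the load
   // out of the loop (a concurrent unsynchronized write would be a data
   // race, i.e. undefined behavior), so the loop can spin forever even
   // after another thread sets the flag.
   while (plain_ready == 0) { /* spin */ }
}

void wait_atomic() {
   // The atomic load has to be performed again on every iteration; the
   // compiler may not cache it in a register across iterations.
   while (atomic_ready.load(std::memory_order_relaxed) == 0) { /* spin */ }
}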
