
OpenMP atomic and non-atomic reads/writes produce the same instructions on x86_64

According to the OpenMP Specification (v4.0), the following program contains a possible data race due to the unsynchronized read/write of i:

int i{0}; // std::atomic<int> i{0};

void write() {
// #pragma omp atomic write // seq_cst
   i = 1;
}

int read() {
   int j;
// #pragma omp atomic read // seq_cst
   j = i; 
   return j;
}

int main() {
   #pragma omp parallel
   { /* code that calls both write() and read() */ }
}

Possible solutions that came to my mind are shown in the code as comments:

  1. to protect the write and read of i with #pragma omp atomic write/read,
  2. to protect the write and read of i with #pragma omp atomic write/read seq_cst,
  3. to use std::atomic<int> instead of int as the type of i.

Here are the compiler-generated instructions on x86_64 (with -O2 in all cases):

GNU g++ 4.9.2:               i = 1;        j = i;
original code:               MOV           MOV
#pragma omp atomic:          MOV           MOV
#pragma omp atomic seq_cst:  MOV+MFENCE    MOV    (originally reported as plain MOV; see UPDATE)
std::atomic<int>:            MOV+MFENCE    MOV

clang++ 3.5.0:               i = 1;        j = i;
original code:               MOV           MOV
#pragma omp atomic:          MOV           MOV
#pragma omp atomic seq_cst:  MOV           MOV
std::atomic<int>:            XCHG          MOV

Intel icpc 16.0.1:           i = 1;        j = i;
original code:               MOV           MOV
#pragma omp atomic:          *             *
#pragma omp atomic seq_cst:  *             *
std::atomic<int>:            XCHG          MOV

* Multiple instructions with calls to __kmpc_atomic_xxx functions.
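
(For reference, listings like these can typically be reproduced by compiling with OpenMP enabled and emitting assembly; the exact OpenMP flag is an assumption here and varies by compiler and version, e.g. -fopenmp for g++/clang++ and -qopenmp for icpc:)

g++     -O2 -fopenmp  -S -o gcc.s    test.cpp
clang++ -O2 -fopenmp  -S -o clang.s  test.cpp
icpc    -O2 -qopenmp  -S -o icpc.s   test.cpp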

What I wonder is why the GNU and Clang compilers do not generate any special instructions for #pragma omp atomic writes. I would expect instructions similar to those for std::atomic, i.e., either MOV+MFENCE or XCHG. Any explanation?
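
For comparison, here is a minimal std::atomic-only sketch of my own (not part of the original question) that isolates the same distinction. On x86_64 at -O2, the relaxed store typically compiles to a plain MOV, while the seq_cst store typically compiles to MOV+MFENCE with g++ or XCHG with clang++, matching the std::atomic rows in the tables above.

#include <atomic>

std::atomic<int> i{0};

void write_relaxed() {
   // Atomic but not sequentially consistent: on x86_64 an aligned store
   // is already atomic, so this typically compiles to a plain MOV.
   i.store(1, std::memory_order_relaxed);
}

void write_seq_cst() {
   // Sequentially consistent store: typically MOV+MFENCE (g++) or
   // XCHG (clang++) on x86_64.
   i.store(1, std::memory_order_seq_cst);
}

int read_seq_cst() {
   // Even a seq_cst load is typically a plain MOV on x86_64; the
   // ordering cost is paid on the store side.
   return i.load(std::memory_order_seq_cst);
}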

UPDATE

g++ 5.3.0 produces MFENCE for #pragma omp atomic write seq_cst. That is the correct behavior, I believe. Without seq_cst, it produces a plain MOV, which is sufficient for non-SC atomicity.

There was a bug in my Makefile; g++ 4.9.2 produces MFENCE for the SC atomic write as well. Sorry guys for that.

Clang 3.5.0 does not implement the OpenMP SC atomics; thanks to Hristo Iliev for pointing this out.

There are two possibilities.

  1. The compiler is not obligated to convert C++ code containing a data race into bad machine code. Depending on the machine memory model, the instructions normally used may already be atomic and coherent. Take that same C++ code to another architecture and you may start seeing the pragmas cause differences that didn't exist on x86_64.

  2. In addition to potentially causing the use of different instructions and/or extra memory fence instructions, the atomic pragmas (as well as std::atomic and volatile) also constrain the compiler's own code-reordering optimizations. They may not apply to your simple case, but you certainly could see that common-subexpression elimination, including hoisting computations out of a loop, may be affected (see the sketch after this list).
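
As a hedged illustration of the second point (a sketch of my own, not from the original answer): with a plain int flag the optimizer may hoist the load out of a wait loop, because a racy read is undefined behavior, whereas an atomic read must be re-performed on every iteration.

#include <atomic>

int plain_ready = 0;
std::atomic<int> atomic_ready{0};

void wait_plain() {
   // Under -O2 the compiler may read plain_ready once and hoist the load
   // out of the loop (a concurrent unsynchronized write would be a data
   // race, i.e. undefined behavior), so the loop can spin forever even
   // after another thread sets the flag.
   while (plain_ready == 0) { /* spin */ }
}

void wait_atomic() {
   // The atomic load has to be performed again on every iteration; the
   // compiler may not cache it in a register across iterations.
   while (atomic_ready.load(std::memory_order_relaxed) == 0) { /* spin */ }
}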
