Does the x86 MOV instruction implement a C++11 memory_order_release atomic store?

Question

According to this https://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html , a release store is implemented as MOV (into memory) on x86 (including x86-64).

According to this http://en.cppreference.com/w/cpp/atomic/memory_order

memory_order_release:

A store operation with this memory order performs the release operation: no memory accesses in the current thread can be reordered after this store. This ensures that all writes in the current thread are visible in other threads that acquire the same atomic variable, and that writes that carry a dependency into the atomic variable become visible in other threads that consume the same atomic.

I understand this to mean that, when memory_order_release is used, all earlier memory stores must complete before this one.

int a;
a = 10;
std::atomic<int> b;
b.store(50, std::memory_order_release); // I can be sure that 'a' is already 10, so the processor can't reorder the stores to 'a' and 'b'

QUESTION: How can a bare MOV instruction (without an explicit memory fence) be sufficient for this behaviour? How does MOV tell the processor to finish all previous stores?

Answer 1

Memory reordering can happen at run time (done by the CPU) and at compile time (done by the compiler). For further information, read Jeff Preshing's article on compile-time reordering (and the many other good ones on that blog).

memory_order_release prevents the compiler from reordering accesses to data, and also makes it emit any fences or special instructions the target needs. In x86 asm, ordinary loads and stores already have acquire/release semantics, so blocking compile-time reordering is sufficient for acq_rel, but not for seq_cst.
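To make the guarantee concrete, here is a minimal sketch of the usual message-passing pattern built on the question's own values. The function name `message_passing_once` and the variable names are made up for illustration and do not come from the original posts. The point is only that the plain write to `payload` cannot become visible after the release store to `ready`, so a thread that observes the store through an acquire load also sees the payload.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper (not from the original posts): publishes a plain int
// through a release store and reads it back after an acquire load.
int message_passing_once() {
    int payload = 0;              // ordinary, non-atomic data
    std::atomic<int> ready{0};    // atomic flag that publishes the payload
    int seen = -1;

    std::thread producer([&] {
        payload = 10;                                // plain store
        ready.store(50, std::memory_order_release);  // release store
    });

    std::thread consumer([&] {
        // Spin until the acquire load observes the value written by the
        // release store above.
        while (ready.load(std::memory_order_acquire) != 50) {
            std::this_thread::yield();
        }
        seen = payload;  // ordered after the producer's write to payload
    });

    producer.join();
    consumer.join();

    assert(seen == 10);  // guaranteed by the release/acquire pairing
    return seen;
}
```

Calling `message_passing_once()` from `main` should always return 10. On x86 the release store is expected to compile to a plain MOV, as the mapping page linked in the question describes, while other targets may need extra instructions for the same source.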

Answer 2

That does appear to be the mapping, at least in code compiled with the Intel compiler, where I see:

0000000000401100 <_Z5storeRSt6atomicIiE>:
  401100:       48 89 fa                mov    %rdi,%rdx
  401103:       b8 32 00 00 00          mov    $0x32,%eax
  401108:       89 02                   mov    %eax,(%rdx)
  40110a:       c3                      retq
  40110b:       0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)

0000000000401110 <_Z4loadRSt6atomicIiE>:
  401110:       48 89 f8                mov    %rdi,%rax
  401113:       8b 00                   mov    (%rax),%eax
  401115:       c3                      retq
  401116:       0f 1f 00                nopl   (%rax)
  401119:       0f 1f 80 00 00 00 00    nopl   0x0(%rax)

for the code:

#include <atomic>
#include <stdio.h>

void store( std::atomic<int> & b ) ;

int load( std::atomic<int> & b ) ;

int main()
{
   std::atomic<int> b ;

   store( b ) ;

   printf("%d\n", load( b ) ) ;

   return 0 ;
}

void store( std::atomic<int> & b )
{
   b.store(50, std::memory_order_release ) ;
}

int load( std::atomic<int> & b )
{
   int v = b.load( std::memory_order_acquire ) ;

   return v ;
}

The current Intel architecture documentation, Volume 3 (System Programming Guide), does a nice job of explaining this. See:

8.2.2 Memory Ordering in P6 and More Recent Processor Families

Reads are not reordered with other reads.
Writes are not reordered with older reads.
Writes to memory are not reordered with other writes, with the following exceptions: ...

The full memory model is explained there. I'd assume that Intel and the C++ standard folks have worked together in detail to nail down the best mapping for each of the memory order operations that conforms to the memory model described in Volume 3, and that plain stores and loads have been determined to be sufficient in those cases.
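Plain loads and stores stop being sufficient once seq_cst is requested, as Answer 1 points out. The same section of the Intel manual also states that reads may be reordered with older writes to different locations, which is the classic store-buffering case. The sketch below is a hypothetical litmus test (the name `count_both_zero` is invented here): with seq_cst on every access, the outcome where both threads read 0 is forbidden by the C++ memory model, so the count must stay at zero. With weaker orderings that outcome would be permitted, although whether it actually shows up in a given run is not something this sketch can promise.

```cpp
#include <atomic>
#include <thread>

// Hypothetical litmus test (store buffering): returns how many of
// `iterations` runs ended with both threads reading 0.
int count_both_zero(int iterations) {
    int both_zero = 0;
    for (int i = 0; i < iterations; ++i) {
        std::atomic<int> x{0};
        std::atomic<int> y{0};
        int r1 = -1;
        int r2 = -1;

        std::thread t1([&] {
            x.store(1, std::memory_order_seq_cst);   // store, then...
            r1 = y.load(std::memory_order_seq_cst);  // ...load the other flag
        });
        std::thread t2([&] {
            y.store(1, std::memory_order_seq_cst);
            r2 = x.load(std::memory_order_seq_cst);
        });

        t1.join();
        t2.join();

        // Under seq_cst at least one thread must see the other's store.
        if (r1 == 0 && r2 == 0) {
            ++both_zero;
        }
    }
    return both_zero;
}
```

For example, `count_both_zero(1000)` should return 0. To get this on x86 the compiler has to emit more than a bare MOV for the seq_cst store; the mapping page linked in the question lists MOV followed by MFENCE, or XCHG, for that case.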

Note that just because no special instructions are required for this ordered store on x86-64 doesn't mean that will be universally true. For PowerPC I'd expect to see something like an lwsync instruction along with the store, and on HP-UX (ia64) the compiler should be using a st4.rel instruction.

2 answers

Answer 1 (score 5), 2015-04-28 15:57:26

Answer 2 (score 4, accepted), 2015-04-28 15:48:53