Does standard C++11 guarantee that memory_order_seq_cst prevents StoreLoad reordering of non-atomic around an atomic?

Does standard C++11 guarantee that memory_order_seq_cst prevents StoreLoad reordering around an atomic operation for non-atomic memory accesses?

As is known, C++11 defines six std::memory_order values, which specify how regular, non-atomic memory accesses are to be ordered around an atomic operation. From the Working Draft, Standard for Programming Language C++, 2016-07-12: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/n4606.pdf

§ 29.3 Order and consistency

§ 29.3 / 1

The enumeration memory_order specifies the detailed regular (non-atomic) memory synchronization order as defined in 1.10 and may provide for operation ordering. Its enumerated values and their meanings are as follows:

It is also known that these six memory orders prevent some kinds of reordering (LoadLoad, LoadStore, StoreStore, StoreLoad), as summarized in a table that the original post included as an image.

But does memory_order_seq_cst prevent StoreLoad reordering around an atomic operation for regular, non-atomic memory accesses, or only for other atomic operations that also use memory_order_seq_cst?

I.e. to prevent this StoreLoad reordering, should we use std::memory_order_seq_cst for both the STORE and the LOAD, or only for one of them?

std::atomic<int> a, b;
b.store(1, std::memory_order_seq_cst); // Sequential Consistency
a.load(std::memory_order_seq_cst); // Sequential Consistency
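The StoreLoad case is commonly phrased as the store-buffering litmus test. Below is a minimal sketch of it, assuming two threads and a hypothetical helper count_both_zero (the helper and the variable names x, y, r1, r2 are not from the original post). With seq_cst on all four operations, the single total order on seq_cst operations rules out both loads returning 0.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: runs the store-buffering litmus test `rounds` times
// and counts how often both loads returned 0.
int count_both_zero(int rounds) {
    int both_zero = 0;
    for (int i = 0; i < rounds; ++i) {
        std::atomic<int> x{0}, y{0};
        int r1 = -1, r2 = -1;

        std::thread t1([&] {
            x.store(1, std::memory_order_seq_cst);   // STORE
            r1 = y.load(std::memory_order_seq_cst);  // LOAD from a different location
        });
        std::thread t2([&] {
            y.store(1, std::memory_order_seq_cst);   // STORE
            r2 = x.load(std::memory_order_seq_cst);  // LOAD from a different location
        });
        t1.join();
        t2.join();

        // r1 and r2 are read only after join(), so there is no data race on them.
        assert(r1 != -1 && r2 != -1);
        if (r1 == 0 && r2 == 0) ++both_zero;
    }
    return both_zero;
}
```

Calling count_both_zero(1000) should return 0 on a conforming implementation. Weakening any of the four operations makes the both-zero outcome permitted, although a given run may still not show it.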

Acquire-Release semantics are clear: they specify exactly how non-atomic memory accesses may be reordered across atomic operations: http://en.cppreference.com/w/cpp/atomic/memory_order
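As an illustration of that Acquire-Release guarantee for non-atomic data, here is a minimal message-passing sketch. The names payload, ready and message_passing are made up for this example and do not come from the original post.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int payload = 0;                 // regular, non-atomic variable
std::atomic<bool> ready{false};  // atomic flag used for synchronization

// Hypothetical helper: returns the payload value seen by the consumer thread.
int message_passing() {
    payload = 0;
    ready.store(false, std::memory_order_relaxed);
    int seen = -1;

    std::thread producer([] {
        payload = 42;                                   // non-atomic write
        ready.store(true, std::memory_order_release);   // earlier writes cannot move below this store
    });
    std::thread consumer([&] {
        while (!ready.load(std::memory_order_acquire))  // later reads cannot move above this load
            std::this_thread::yield();
        seen = payload;                                 // non-atomic read, ordered after the write
        assert(seen == 42);
    });
    producer.join();
    consumer.join();
    return seen;
}
```

The release store and the acquire load that reads it form a synchronizes-with edge, so the non-atomic write to payload happens before the non-atomic read. Nothing in this pattern relies on StoreLoad ordering.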


To prevent StoreLoad reordering we should use std::memory_order_seq_cst.

Two examples:

  1. std::memory_order_seq_cst for both STORE and LOAD: there is an MFENCE

StoreLoad can't be reordered - GCC 6.1.0 x86_64: https://godbolt.org/g/mVZJs0

std::atomic<int> a, b;
b.store(1, std::memory_order_seq_cst); // can't be executed after LOAD
a.load(std::memory_order_seq_cst); // can't be executed before STORE
  2. std::memory_order_seq_cst for LOAD only: there is no MFENCE

StoreLoad can be reordered - GCC 6.1.0 x86_64: https://godbolt.org/g/2NLy12

std::atomic<int> a, b;
b.store(1, std::memory_order_release); // can be executed after LOAD
a.load(std::memory_order_seq_cst); // can be executed before STORE

Also, a C/C++ compiler could use the alternative mapping of C/C++11 to x86, which flushes the Store Buffer before the LOAD: MFENCE, MOV (from memory). In that case we must use std::memory_order_seq_cst for the LOAD too: http://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html This example is discussed in another question as approach (3): "Does it make any sense instruction LFENCE in processors x86/x86_64?"

I.e. we should use std::memory_order_seq_cst for both STORE and LOAD to guarantee that an MFENCE is generated, which prevents StoreLoad reordering.

Is it true that memory_order_seq_cst for an atomic load or store:

  • specifies Acquire-Release semantics, preventing LoadLoad, LoadStore and StoreStore reordering around an atomic operation for regular, non-atomic memory accesses,

  • but prevents StoreLoad reordering around an atomic operation only for other atomic operations with the same memory_order_seq_cst?

No, standard C++11 doesn't guarantee that memory_order_seq_cst prevents StoreLoad reordering of non-atomic around an atomic(seq_cst).

Standard C++11 doesn't even guarantee that memory_order_seq_cst prevents StoreLoad reordering of atomic(non-seq_cst) around an atomic(seq_cst).

Working Draft, Standard for Programming Language C++, 2016-07-12: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/n4606.pdf

  • There shall be a single total order S on all memory_order_seq_cst operations - C++11 Standard:

§ 29.3 / 3

There shall be a single total order S on all memory_order_seq_cst operations, consistent with the “happens before” order and modification orders for all affected locations, such that each memory_order_seq_cst operation B that loads a value from an atomic object M observes one of the following values: ...

  • But atomic operations with ordering weaker than memory_order_seq_cst are not sequentially consistent and are not part of that single total order, i.e. non-memory_order_seq_cst operations can be reordered with memory_order_seq_cst operations in the allowed directions - C++11 Standard:

§ 29.3 / 8

[ Note: memory_order_seq_cst ensures sequential consistency only for a program that is free of data races and uses exclusively memory_order_seq_cst operations. Any use of weaker ordering will invalidate this guarantee unless extreme care is used. In particular, memory_order_seq_cst fences ensure a total order only for the fences themselves. Fences cannot, in general, be used to restore sequential consistency for atomic operations with weaker ordering specifications. — end note ]
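The note says fences cannot restore sequential consistency in general. For the narrow store-then-load pattern asked about, however, my reading of the fence rules in § 29.3 is that a memory_order_seq_cst fence placed between the store and the load in each thread does forbid both loads returning 0, even when the operations themselves are relaxed. A minimal sketch under that assumption (count_both_zero_fenced and the variable names are hypothetical, not from the original post):

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: store-buffering test with relaxed operations separated
// by seq_cst fences; returns how often both loads returned 0.
int count_both_zero_fenced(int rounds) {
    int both_zero = 0;
    for (int i = 0; i < rounds; ++i) {
        std::atomic<int> x{0}, y{0};
        int r1 = -1, r2 = -1;

        std::thread t1([&] {
            x.store(1, std::memory_order_relaxed);
            std::atomic_thread_fence(std::memory_order_seq_cst);  // orders the store before the load
            r1 = y.load(std::memory_order_relaxed);
        });
        std::thread t2([&] {
            y.store(1, std::memory_order_relaxed);
            std::atomic_thread_fence(std::memory_order_seq_cst);  // orders the store before the load
            r2 = x.load(std::memory_order_relaxed);
        });
        t1.join();
        t2.join();

        // Both lambdas have finished, so r1 and r2 hold loaded values (0 or 1).
        assert(r1 != -1 && r2 != -1);
        if (r1 == 0 && r2 == 0) ++both_zero;
    }
    return both_zero;
}
```

The two fences are totally ordered, so whichever thread's fence comes second must observe the other thread's store. This only covers atomic objects; it says nothing about racing non-atomic accesses, which remain undefined behavior.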


C++ compilers also allow such reorderings:

  1. On x86_64

Usually, if the compiler implements seq_cst as a barrier after the store, then:

STORE-C(relaxed); LOAD-B(seq_cst); can be reordered to LOAD-B(seq_cst); STORE-C(relaxed);

Screenshot of Asm generated by GCC 7.0 x86_64: https://godbolt.org/g/4yyeby

Also theoretically possible: if the compiler implements seq_cst as a barrier before the load, then:

STORE-A(seq_cst); LOAD-C(acq_rel); can be reordered to LOAD-C(acq_rel); STORE-A(seq_cst);

  2. On PowerPC

STORE-A(seq_cst); LOAD-C(relaxed); can be reordered to LOAD-C(relaxed); STORE-A(seq_cst);

The following reordering is also possible on PowerPC:

STORE-A(seq_cst); STORE-C(relaxed); can be reordered to STORE-C(relaxed); STORE-A(seq_cst);

If even atomic variables are allowed to be reordered across an atomic(seq_cst), then non-atomic variables can also be reordered across an atomic(seq_cst).

Screenshot of Asm generated by GCC 4.8 PowerPC: https://godbolt.org/g/BTQBr8


More details:

  1. On x86_64

STORE-C(release); LOAD-B(seq_cst); can be reordered to LOAD-B(seq_cst); STORE-C(release);

Intel® 64 and IA-32 Architectures

8.2.3.4 Loads May Be Reordered with Earlier Stores to Different Locations

I.e. x86_64 code:

STORE-A(seq_cst);
STORE-C(release); 
LOAD-B(seq_cst);

Can be reordered to:

STORE-A(seq_cst);
LOAD-B(seq_cst);
STORE-C(release); 

This can happen because there is no mfence between c.store and b.load:

x86_64 - GCC 7.0: https://godbolt.org/g/dRGTaO

C++ & asm code:

#include <atomic>

// Atomic load-store
void test() {
    std::atomic<int> a, b, c;
    a.store(2, std::memory_order_seq_cst);          // movl 2,[a]; mfence;
    c.store(4, std::memory_order_release);          // movl 4,[c];
    int tmp = b.load(std::memory_order_seq_cst);    // movl [b],[tmp];
}

It can be reordered to:

#include <atomic>

// Atomic load-store
void test() {
    std::atomic<int> a, b, c;
    a.store(2, std::memory_order_seq_cst);          // movl 2,[a]; mfence;
    int tmp = b.load(std::memory_order_seq_cst);    // movl [b],[tmp];
    c.store(4, std::memory_order_release);          // movl 4,[c];
}

Also, sequential consistency on x86/x86_64 can be implemented in four ways: http://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html

  1. LOAD (without fence) and STORE + MFENCE
  2. LOAD (without fence) and LOCK XCHG
  3. MFENCE + LOAD and STORE (without fence)
  4. LOCK XADD(0) and STORE (without fence)
  • Ways 1 and 2: LOAD and (STORE + MFENCE)/(LOCK XCHG) - reviewed above
  • Ways 3 and 4: (MFENCE + LOAD)/LOCK XADD and STORE - allow the following reordering:

STORE-A(seq_cst); LOAD-C(acq_rel); can be reordered to LOAD-C(acq_rel); STORE-A(seq_cst);


  2. On PowerPC

STORE-A(seq_cst); LOAD-C(relaxed); can be reordered to LOAD-C(relaxed); STORE-A(seq_cst);

PowerPC allows Store-Load reordering (Table 5 - PowerPC): http://www.rdrop.com/users/paulmck/scalability/paper/whymb.2010.06.07c.pdf

Stores Reordered After Loads

I.e. PowerPC code:

STORE-A(seq_cst);
STORE-C(relaxed); 
LOAD-C(relaxed); 
LOAD-B(seq_cst);

Can be reordered to:

LOAD-C(relaxed);
STORE-A(seq_cst);
STORE-C(relaxed); 
LOAD-B(seq_cst);

PowerPC - GCC 4.8: https://godbolt.org/g/xowFD3

C++ & asm code:

#include <atomic>

// Atomic load-store
void test() {
    std::atomic<int> a, b, c;       // addr: 20, 24, 28
    a.store(2, std::memory_order_seq_cst);          // li r9<-2; sync; stw r9->[a];
    c.store(4, std::memory_order_relaxed);          // li r9<-4; stw r9->[c];
    c.load(std::memory_order_relaxed);              // lwz r9<-[c];
    int tmp = b.load(std::memory_order_seq_cst);    // sync; lwz r9<-[b]; ... isync;
}

By dividing a.store into two parts, it can be reordered to:

#include <atomic>

// Atomic load-store
void test() {
    std::atomic<int> a, b, c;       // addr: 20, 24, 28
    //a.store(2, std::memory_order_seq_cst);            // part-1: li r9<-2; sync;
    c.load(std::memory_order_relaxed);              // lwz r9<-[c];
    a.store(2, std::memory_order_seq_cst);          // part-2: stw r9->[a];
    c.store(4, std::memory_order_relaxed);          // li r9<-4; stw r9->[c];
    int tmp = b.load(std::memory_order_seq_cst);    // sync; lwz r9<-[b]; ... isync;
}

Here the load from memory lwz r9<-[c]; is executed earlier than the store to memory stw r9->[a];.


The following reordering is also possible on PowerPC:

STORE-A(seq_cst); STORE-C(relaxed); can be reordered to STORE-C(relaxed); STORE-A(seq_cst);

This is because PowerPC has a weak memory ordering model that allows Store-Store reordering (Table 5 - PowerPC): http://www.rdrop.com/users/paulmck/scalability/paper/whymb.2010.06.07c.pdf

Stores Reordered After Stores

I.e. on PowerPC a Store can be reordered with another Store, so the previous example can be reordered as follows:

#include <atomic>

// Atomic load-store
void test() {
    std::atomic<int> a, b, c;       // addr: 20, 24, 28
    //a.store(2, std::memory_order_seq_cst);            // part-1: li r9<-2; sync;
    c.load(std::memory_order_relaxed);              // lwz r9<-[c];
    c.store(4, std::memory_order_relaxed);          // li r9<-4; stw r9->[c];
    a.store(2, std::memory_order_seq_cst);          // part-2: stw r9->[a];
    int tmp = b.load(std::memory_order_seq_cst);    // sync; lwz r9<-[b]; ... isync;
}

Here the store to memory stw r9->[c]; is executed earlier than the store to memory stw r9->[a];.

std::memory_order_seq_cst guarantees that neither the compiler nor the CPU reorders the operations. The resulting memory order is the same as if only one instruction were executed at a time.

But compiler optimization confuses the issue: if you turn off -O3, the fence is there.

With -O3, the compiler can see that the mfence has no consequence in your test program, because the program is too simple.

If, on the other hand, you compile it for ARM, you can see the dmb ish barriers.

So if your program is more complex, you might see the mfence in this part of the code, but not if the compiler can analyse the code and conclude that it is not needed.
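If the worry is that the optimizer can reason about atomics declared as locals, one way to take that variable out of the picture when inspecting the generated code is to make them globals. A minimal sketch, assuming the same three atomics as in the test above (the function name store_store_load is made up for this example):

```cpp
#include <atomic>
#include <cassert>

// Globals with external linkage: the compiler cannot prove that no other
// thread or translation unit observes them.
std::atomic<int> a{0}, b{0}, c{0};

// Hypothetical variant of the test function from the answers above.
int store_store_load() {
    // The instruction-level discussion assumes lock-free atomics.
    assert(a.is_lock_free());
    a.store(2, std::memory_order_seq_cst);     // seq_cst store
    c.store(4, std::memory_order_release);     // weaker store
    return b.load(std::memory_order_seq_cst);  // seq_cst load
}
```

Compiling this at different optimization levels and for different targets shows which barriers the compiler actually emits between the release store and the seq_cst load.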
