
Is atomic_thread_fence(memory_order_release) different from using memory_order_acq_rel?

cppreference.com provides this note about std::atomic_thread_fence (emphasis mine):

atomic_thread_fence imposes stronger synchronization constraints than an atomic store operation with the same std::memory_order.

While an atomic store-release operation prevents all preceding writes from moving past the store-release, an atomic_thread_fence with memory_order_release ordering prevents all preceding writes from moving past all subsequent stores.

I understand this note to mean that std::atomic_thread_fence(std::memory_order_release) is not unidirectional the way a store-release is. It's a bidirectional fence, preventing stores on either side of the fence from reordering past a store on the other side of the fence.

If I understand that correctly, this fence seems to make the same guarantees that atomic_thread_fence(memory_order_acq_rel) does. It is an "upward" fence, and a "downward" fence.

Is there a functional difference between std::atomic_thread_fence(std::memory_order_release) and std::atomic_thread_fence(std::memory_order_acq_rel)? Or is the difference merely aesthetic, to document the purpose of the code?

A standalone fence imposes stronger ordering than an atomic operation with the same ordering constraint, but this does not change the direction in which ordering is enforced.

Both an atomic release operation and a standalone release fence are uni-directional, but the atomic operation orders preceding operations with respect to itself, whereas the fence orders them with respect to all subsequent stores.

For example, an atomic operation with release semantics:

std::atomic<int> sync{0};

// memory operations A

sync.store(1, std::memory_order_release);

// store B

This guarantees that no memory operation in A (loads and stores) can be (visibly) reordered with the atomic store itself. But it is uni-directional: no ordering rules apply to memory operations that are sequenced after the atomic operation. Therefore, store B can still be reordered with any of the memory operations in A.
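To make this concrete, here is a minimal sketch of the usual message-passing pattern built on a store-release. The names `payload` and `run_release_store` are hypothetical and only there for illustration, and the consumer side (an acquire load in a spin loop) is my assumption about how the flag would be read.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: publishes `payload` through a store-release on `sync`
// and returns the value the consumer thread observed.
int run_release_store() {
    int payload = 0;             // plain, non-atomic data (part of "A")
    std::atomic<int> sync{0};
    int seen = -1;

    std::thread producer([&] {
        payload = 42;                              // memory operation A
        sync.store(1, std::memory_order_release);  // A may not move below this store
    });

    std::thread consumer([&] {
        // The acquire load that reads 1 synchronizes with the release store.
        while (sync.load(std::memory_order_acquire) == 0) {
            std::this_thread::yield();
        }
        seen = payload;  // no data race: the write to payload happens-before this read
    });

    producer.join();
    consumer.join();
    assert(seen == 42);
    return seen;
}
```

Note that the guarantee is tied to `sync` only. A later store to some other atomic variable in the producer would not publish `payload`.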

A standalone release fence changes this behavior:

// memory operations A

std::atomic_thread_fence(std::memory_order_release);

// load X

sync.store(1, std::memory_order_relaxed);

// stores B

This guarantees that no memory operation in A can be (visibly) reordered with any of the stores that are sequenced after the release fence. Here, the stores in B can no longer be reordered with any of the memory operations in A, and as such, the release fence is stronger than the atomic release operation. But it is also uni-directional, since the load from X can still be reordered with any memory operation in A.
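A rough sketch of what "ordered with respect to all subsequent stores" buys you: one release fence followed by two relaxed stores, each of which can publish the earlier write. The names `flag1`, `flag2`, `payload` and `run_release_fence` are made up for this example, and the acquire loads on the reading side are my assumption.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: one release fence covers two later relaxed stores.
// Returns the sum of what the two consumers read from `payload`.
int run_release_fence() {
    int payload = 0;                    // plain data written before the fence
    std::atomic<int> flag1{0}, flag2{0};
    int seen1 = -1, seen2 = -1;

    std::thread producer([&] {
        payload = 42;                                         // memory operation A
        std::atomic_thread_fence(std::memory_order_release);  // orders A before both stores below
        flag1.store(1, std::memory_order_relaxed);
        flag2.store(1, std::memory_order_relaxed);
    });

    // Each consumer waits on a different flag; both are covered by the fence.
    std::thread consumer1([&] {
        while (flag1.load(std::memory_order_acquire) == 0) std::this_thread::yield();
        seen1 = payload;
    });
    std::thread consumer2([&] {
        while (flag2.load(std::memory_order_acquire) == 0) std::this_thread::yield();
        seen2 = payload;
    });

    producer.join();
    consumer1.join();
    consumer2.join();
    assert(seen1 == 42 && seen2 == 42);
    return seen1 + seen2;
}
```

With a store-release on `flag1` instead of the fence, only the consumer waiting on `flag1` would be guaranteed to see `payload == 42`; the read in `consumer2` would be a data race.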

The difference is subtle and usually an atomic release operation is preferred over a standalone release fence.

The rules for a standalone acquire fence are similar, except that it enforces ordering in the opposite direction and operates on loads:

// loads B

sync.load(std::memory_order_relaxed);
std::atomic_thread_fence(std::memory_order_acquire);

// memory operations A

No memory operation in A can be reordered with any load that is sequenced before the standalone acquire fence.
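As an illustration, the sketch below pairs a relaxed load plus a standalone acquire fence with an ordinary store-release. The names `payload` and `run_acquire_fence` are hypothetical, and the producer side is my assumption about what the fence would typically be paired with.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: the consumer spins with relaxed loads and upgrades
// the ordering once, after the loop, with a standalone acquire fence.
int run_acquire_fence() {
    int payload = 0;
    std::atomic<int> sync{0};
    int seen = -1;

    std::thread producer([&] {
        payload = 42;
        sync.store(1, std::memory_order_release);
    });

    std::thread consumer([&] {
        // Relaxed loads alone give no ordering guarantee for `payload`.
        while (sync.load(std::memory_order_relaxed) == 0) {
            std::this_thread::yield();
        }
        // The fence orders the load above before everything that follows it.
        std::atomic_thread_fence(std::memory_order_acquire);
        seen = payload;  // memory operation A, now safely after the load
    });

    producer.join();
    consumer.join();
    assert(seen == 42);
    return seen;
}
```

One reason to write it this way is that the acquire ordering is paid for once, after the loop, rather than on every iteration of the spin.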

A standalone fence with std::memory_order_acq_rel ordering combines the logic for both acquire and release fences.

// memory operations A
// load A

std::atomic_thread_fence(std::memory_order_acq_rel);

// store B
// memory operations B

But this can get incredibly tricky once you realize that a store in A can still be reordered with a load in B. Acq/rel fences should probably be avoided in favor of regular atomic operations, or even better, mutexes.
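The store-then-load case can be sketched with the classic "store buffering" test. Everything here is illustrative: `count_both_zero` is a hypothetical helper, and the comparison with `std::memory_order_seq_cst` is my addition, since it is the fence ordering that the standard does guarantee to rule out the "both threads read 0" outcome.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: runs the store-buffering test `iterations` times with
// the given fence ordering and counts how often both threads read 0.
int count_both_zero(std::memory_order fence_order, int iterations) {
    int both_zero = 0;
    for (int i = 0; i < iterations; ++i) {
        std::atomic<int> x{0}, y{0};
        int r1 = -1, r2 = -1;

        std::thread t1([&] {
            x.store(1, std::memory_order_relaxed);    // store before the fence
            std::atomic_thread_fence(fence_order);
            r1 = y.load(std::memory_order_relaxed);   // load after the fence
        });
        std::thread t2([&] {
            y.store(1, std::memory_order_relaxed);
            std::atomic_thread_fence(fence_order);
            r2 = x.load(std::memory_order_relaxed);
        });

        t1.join();
        t2.join();
        assert((r1 == 0 || r1 == 1) && (r2 == 0 || r2 == 1));
        if (r1 == 0 && r2 == 0) ++both_zero;
    }
    return both_zero;
}
```

With `std::memory_order_seq_cst` the returned count must be 0. With `std::memory_order_acq_rel` a non-zero count is permitted by the memory model; whether you actually observe one depends on the hardware, the compiler and timing, so a run that returns 0 proves nothing.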

cppreference.com made some mistakes in the paragraph you quoted. I have marked the corrections in parentheses below:

atomic_thread_fence imposes stronger synchronization constraints than an atomic store operation with the same std::memory_order. While an atomic store-release operation prevents all preceding writes (should be memory operations, i.e. both reads and writes) from moving past the store-release (more completely: the store-release operation itself), an atomic_thread_fence with memory_order_release ordering prevents all preceding writes (again, should be memory operations, i.e. both reads and writes) from moving past all subsequent stores.

To paraphrase it:

The release operation actually places fewer memory ordering constraints on neighboring operations than the release fence. A release operation only needs to prevent preceding memory operations from being reordered past itself, but a release fence must prevent preceding memory operations from being reordered past all subsequent writes. Because of this difference, a release operation can never take the place of a release fence.

This is quoted from here.

This is my interpretation of the following text, and I think it is what was intended. That interpretation is correct in terms of the memory model, but the text is still bad because it is an incomplete explanation.

While an atomic store-release operation prevents all preceding writes from moving past the store-release, an atomic_thread_fence with memory_order_release ordering prevents all preceding writes from moving past all subsequent stores.

The use of "store" vs. "writes" is intentional:

  • "store", here, means a store on an std::atomic<> object (not just a call to std::atomic<>::store, but also an assignment, which is equivalent to .store(value), or an atomic RMW operation);
  • "write", here, means any memory write, either normal (non atomic) or atomic.

It's a bidirectional fence, preventing stores on either side of the fence from reordering past a store on the other side of the fence.

No, you missed an essential distinction, because it was only implied. It is expressed in an unclear, overly subtle way, which is not good for a teaching text!

It says that a release fence is not symmetric: previous memory side effects, called "writes", are ordered before the following atomic store operations.

Even with that clarification, it's incomplete and so it's a bad explanation: it strongly suggests that release fences exist just to make sure that writes (and writes only) are finished. That is not the case.

A release operation is what I call an "I'm done there" signal. It signals that all previous memory operations are done, finished, visible. It's important to understand that not only modifications (which can be detected by looking at memory state) are ordered; every memory operation, reads included, needs to be.

Many write-ups about thread primitives are defective in that way.


 