
C++ Memory Barriers for Atomics

Original 2012-01-12 20:22:55 5 2 c++/ windows/ visual-c++/ memory-barriers

I'm a newbie when it comes to this. Could anyone provide a simplified explanation of the differences between the following memory barriers?

The Windows MemoryBarrier();
The fence _mm_mfence();
The inline assembly asm volatile ("" : : : "memory");
The intrinsic _ReadWriteBarrier();

If there isn't a simple explanation, some links to good articles or books would probably help me get it straight. Until now I was fine with just using objects written by others that wrap these calls, but I'd like a better understanding than my current one, which is basically that there is more than one way to implement memory barriers under the covers.

2 answers

Both MemoryBarrier (MSVC) and _mm_mfence (supported by several compilers) provide a hardware memory fence, which prevents the processor from moving reads and writes across the fence.

The main difference is that MemoryBarrier has platform-specific implementations for x86, x64 and IA64, whereas _mm_mfence specifically uses the mfence SSE2 instruction, so it's not always available.

On x86 and x64, MemoryBarrier is implemented with an xchg and a lock or, respectively, and I have seen some claims that this is faster than mfence. However, my own benchmarks show the opposite, so apparently it depends very much on the processor model.

Another difference is that mfence can also be used for ordering non-temporal stores/loads (movntq etc.).

GCC also has __sync_synchronize, which generates a hardware fence.
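To make the choice between these hardware fences concrete, here is a minimal sketch of a wrapper that picks one per toolchain, followed by a store-then-load check. The names hardware_fence and both_saw_zero are hypothetical and exist only for illustration, and the std::atomic_thread_fence fallback is an assumption for toolchains the answer does not cover.

```cpp
#include <atomic>
#include <thread>

#if defined(_MSC_VER)
  #include <windows.h>    // MemoryBarrier
#elif defined(__SSE2__)
  #include <emmintrin.h>  // _mm_mfence
#endif

// Hypothetical helper: issue one full hardware fence, chosen per toolchain.
inline void hardware_fence() {
#if defined(_MSC_VER)
    MemoryBarrier();
#elif defined(__SSE2__)
    _mm_mfence();
#elif defined(__GNUC__)
    __sync_synchronize();
#else
    std::atomic_thread_fence(std::memory_order_seq_cst);  // assumed portable fallback
#endif
}

// Each thread stores to one variable, fences, then loads the other.
// Returns true if both loads observed 0, which the fences are meant to rule out.
inline bool both_saw_zero() {
    std::atomic<int> x{0}, y{0};
    int r1 = -1, r2 = -1;
    std::thread a([&] {
        x.store(1, std::memory_order_relaxed);
        hardware_fence();
        r1 = y.load(std::memory_order_relaxed);
    });
    std::thread b([&] {
        y.store(1, std::memory_order_relaxed);
        hardware_fence();
        r2 = x.load(std::memory_order_relaxed);
    });
    a.join();
    b.join();
    return r1 == 0 && r2 == 0;
}
```

Calling both_saw_zero() in a loop should never return true. If the hardware_fence() calls are removed, x86 is allowed to let each load pass the earlier store, so the "both zero" outcome becomes possible.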

asm volatile ("" : : : "memory") in GCC and _ReadWriteBarrier in MSVC provide only a compiler-level memory fence, which prevents the compiler from reordering memory accesses across it. The processor is still free to reorder them.
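As a rough sketch, the two compiler-only barriers can be hidden behind one macro. COMPILER_BARRIER, g_payload, g_ready and write_then_flag are hypothetical names, and the std::atomic_signal_fence branch is an assumed fallback for other compilers rather than something the answer mentions.

```cpp
#include <atomic>

#if defined(_MSC_VER)
  #include <intrin.h>  // _ReadWriteBarrier
  #define COMPILER_BARRIER() _ReadWriteBarrier()
#elif defined(__GNUC__)
  #define COMPILER_BARRIER() asm volatile("" : : : "memory")
#else
  // Assumed fallback: the standard compiler-only fence.
  #define COMPILER_BARRIER() std::atomic_signal_fence(std::memory_order_seq_cst)
#endif

// Hypothetical globals, for illustration only.
int g_payload = 0;
int g_ready = 0;

// The compiler may not move the payload store below the flag store.
// No fence instruction is emitted, so the CPU's own ordering rules still apply.
inline void write_then_flag(int value) {
    g_payload = value;
    COMPILER_BARRIER();
    g_ready = 1;
}
```

This only constrains the order in which the compiler emits the two stores. Sharing plain int variables between threads this way is still a data race as far as the C++ standard is concerned, so treat it as an illustration of the barrier and not as a publishing pattern.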

Compiler fences are generally used in combination with operations that have some kind of implicit hardware fence. For example, on x86/x64 every ordinary store has release semantics and every ordinary load has acquire semantics, so a compiler fence is all you need when implementing load-acquire and store-release.
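The same load-acquire/store-release pairing can be written portably with std::atomic. The sketch below is not from the answer, and pass_message is a hypothetical name. On x86/x64, compilers typically lower both operations to plain mov instructions with no fence instruction, which matches the point above.

```cpp
#include <atomic>
#include <thread>

// Hypothetical message-passing demo: returns the value the consumer read.
inline int pass_message(int value) {
    int payload = 0;                 // plain data, published through the flag
    std::atomic<bool> ready{false};
    int seen = -1;

    std::thread consumer([&] {
        // Load-acquire: later reads cannot move above this load.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = payload;
    });

    payload = value;
    // Store-release: earlier writes cannot move below this store.
    ready.store(true, std::memory_order_release);

    consumer.join();
    return seen;
}
```

Because the consumer reads payload only after its acquire load has observed the release store, it is guaranteed to see the value written before the flag was set.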

See my answer here on the hardware-level semantics of fences. What is not mentioned there is that fences also prevent reordering of loads, stores, or both (depending on the fence) across them, at both the compiler level and the hardware level.

Related questions

Do we ever need memory barriers with C++ atomics on Intel x86?

C++ atomics and memory_order with RDMA

Spin locked stack and memory barriers (C++)

Are there any implicit memory barriers in C++

What are examples of memory barriers in C++?

C++ amp atomics

C++ atomics memory ordering for some specific use case

Are C++ atomics preemption safe?

Adding two atomics in C++

how to use c++ atomics




solution1
29 ACCEPTED 2012-01-13 03:23:54

solution2
3 2012-01-13 07:30:41
