
Why can `asm volatile("" ::: "memory")` serve as a compiler barrier?

Original 2021-06-11 21:16:22 1 2 c/ gcc/ inline-assembly/ volatile/ memory-barriers

It is known that asm volatile("" ::: "memory") can serve as a compiler barrier that prevents the compiler from reordering memory accesses across it. For example, this is mentioned in https://preshing.com/20120625/memory-ordering-at-compile-time/ , in the section "Explicit Compiler Barriers".

However, all the articles I can find only state that asm volatile("" ::: "memory") can serve as a compiler barrier, without explaining why the "memory" clobber effectively forms one. The GCC online documentation only says that the special clobber "memory" tells the compiler that the assembly code may perform memory reads or writes other than those specified in the operand lists. But how do such semantics cause the compiler to stop any attempt to reorder memory instructions across it? I tried to answer this myself but failed, so I ask here: why can asm volatile("" ::: "memory") serve as a compiler barrier, based on the semantics of the "memory" clobber?

Please note that I am asking about a "compiler barrier" (in effect at compile time), not the stronger "memory barrier" (in effect at run time). For convenience, I excerpt the semantics of the "memory" clobber from the GCC online documentation below:

The "memory" clobber tells the compiler that the assembly code performs memory reads or writes to items other than those listed in the input and output operands (for example, accessing the memory pointed to by one of the input parameters). To ensure memory contains correct values, GCC may need to flush specific register values to memory before executing the asm. Further, the compiler does not assume that any values read from memory before an asm remain unchanged after that asm; it reloads them as needed. Using the "memory" clobber effectively forms a read/write memory barrier for the compiler.

2 answers

If a variable is potentially read or written, it matters what order that happens in. The point of a "memory" clobber is to make sure the reads and/or writes in an asm statement happen at the right point in the program's execution.

Any read of a C variable's value that happens in the source after an asm statement must be after the memory-clobbering asm statement in the compiler-generated assembly output for the target machine, otherwise it might be reading a value before the asm statement would have changed it.

Any read of a C variable in the source before an asm statement similarly must stay sequenced before it, otherwise it might incorrectly read a modified value.

Similar reasoning applies to assignments to (writes of) C variables before or after any asm statement with a "memory" clobber. It is just like a call to an "opaque" function, one whose definition the compiler can't see.

No reads or writes can reorder with the barrier in either direction, therefore no operation before the barrier can reorder with any operation after the barrier, or vice versa.
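As a rough illustration of the "no reordering in either direction" argument, here is a minimal sketch in GNU C (GCC or Clang assumed). The names COMPILER_BARRIER, payload, ready, publish and try_consume are hypothetical and not from the question. The barrier only constrains what the compiler emits; it does not by itself order anything at run time on a weakly ordered CPU.

```c
/* Hypothetical helper macro: an empty asm statement with a "memory" clobber. */
#define COMPILER_BARRIER() __asm__ __volatile__("" ::: "memory")

int payload;   /* plain, non-volatile data (hypothetical) */
int ready;     /* plain, non-volatile flag (hypothetical) */

void publish(int value)
{
    payload = value;      /* store #1 */
    COMPILER_BARRIER();   /* the asm "might read payload", so store #1 must be emitted
                             before it; it "might write ready", so store #2 must come after */
    ready = 1;            /* store #2 */
}

int try_consume(int *out)
{
    if (!ready)           /* load of the flag */
        return 0;
    COMPILER_BARRIER();   /* the load of payload may not be hoisted above this point */
    *out = payload;
    return 1;
}
```

Without the barrier in publish, the compiler would be free to emit the two stores in either order, since they are independent plain stores as far as the C abstract machine is concerned.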

Another way to look at it: the actual machine memory contents must match the C abstract machine at that point. The compiler-generated asm has to respect that, by storing any variable values from registers to memory before the start of an asm("":::"memory") statement, and afterwards it has to assume that any registers that had copies of variable values might not be up to date anymore. So they have to be reloaded if they're needed.
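The "store before, reload after" view can be sketched as follows, again assuming GNU C; shared_counter and read_twice are made-up names for illustration. Without the barrier, a compiler could load the global once and reuse the register copy.

```c
/* Hypothetical helper macro: an empty asm statement with a "memory" clobber. */
#define COMPILER_BARRIER() __asm__ __volatile__("" ::: "memory")

int shared_counter;   /* non-volatile global (hypothetical) */

int read_twice(void)
{
    int first = shared_counter;    /* load #1 */
    COMPILER_BARRIER();            /* any register copy of shared_counter is no longer trusted */
    int second = shared_counter;   /* load #2: has to be a fresh load from memory */
    return first + second;
}
```

Comparing the generated assembly with and without the barrier (for example with gcc -O2 -S) should show two loads versus one, although the exact output depends on the compiler version and target.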

This reads-everything / writes-everything assumption for the "memory" clobber is what keeps the asm statement from reordering at all at compile time with respect to all accesses, even non-volatile ones. The volatile is already implicit for an asm() statement with no "=..." output operands, and is what stops it from being optimized away entirely (and with it the memory clobber).

Note that only potentially "reachable" C variables are affected. For example, escape analysis can still let the compiler keep a local int i in a register across a "memory" clobber, as long as the asm statement itself doesn't have the address as an input.

Just like a function call: for (int i=0;i<10;i++) {foobar("%d\n", i);} can keep the loop counter in a register, and just copy it to the 2nd arg-passing register for foobar every iteration. There's no way foobar can have a reference to i because its address hasn't been stored anywhere or passed anywhere.

(This is fine for the memory barrier use-case; no other thread could have its address either.)
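To make the reachability point concrete, here is a small sketch under the same GNU C assumption; sum_to, local_sum and global_total are hypothetical names. The local's address never escapes, so the compiler is allowed to keep it in a register across the barrier, while the global must be up to date in memory each time the barrier is reached.

```c
/* Hypothetical helper macro: an empty asm statement with a "memory" clobber. */
#define COMPILER_BARRIER() __asm__ __volatile__("" ::: "memory")

int global_total;   /* reachable from anywhere: must be in memory at each barrier */

int sum_to(int n)
{
    int local_sum = 0;             /* address never taken: may live in a register */
    for (int i = 0; i < n; i++) {  /* same for the loop counter i */
        local_sum += i;
        global_total = local_sum;  /* this store cannot be sunk out of the loop */
        COMPILER_BARRIER();
    }
    return local_sum;
}
```

If the barrier were removed, an optimizing compiler could legally perform a single store to global_total after the loop instead of one per iteration.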

Related:

How does a mutex lock and unlock functions prevents CPU reordering? - why opaque function calls work as compiler barriers.
How can I indicate that the memory *pointed* to by an inline ASM argument may be used? - cases where a "memory" clobber is needed for a non-empty asm statement (or other dummy operands to tell the compiler which memory the asm statement reads or writes).

I'll add that the "memory" clobber is only a compiler directive. An out-of-order or speculative processor may still reorder memory accesses at run time. To prevent this, an explicit memory barrier is necessary. See the Linux kernel documentation on memory barriers.
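The distinction can be sketched like this, assuming GCC or Clang. __sync_synchronize() is the legacy GCC builtin that issues a full memory barrier; the wrapper names and the data_word / flag_word variables are hypothetical. Which fence instruction gets emitted, if any, depends on the target.

```c
/* Compile-time only: constrains the compiler, emits no machine instruction. */
static inline void compiler_barrier(void)
{
    __asm__ __volatile__("" ::: "memory");
}

/* Run-time as well: the builtin asks for a full hardware memory barrier. */
static inline void full_memory_barrier(void)
{
    __sync_synchronize();
}

int data_word;   /* hypothetical shared data */
int flag_word;   /* hypothetical "data is valid" flag */

void publish_for_other_cpu(int v)
{
    data_word = v;
    full_memory_barrier();   /* orders the two stores for other CPUs, not just for the compiler */
    flag_word = 1;
}
```

Replacing full_memory_barrier() with compiler_barrier() here would still keep the stores in program order in the generated code, but the hardware would remain free to make them visible to other CPUs in a different order on weakly ordered architectures.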

Related questions:

GCC memory barrier __sync_synchronize vs asm volatile(“”: : :“memory”)

volatile vs memory barrier for interrupts

Why do read and write barrier for x86 in glibc not use __volatile asm?

Does the inline asm compiler barrier (memory clobber) count as an external function, or as static function call?

The difference between asm, asm volatile and clobbering memory

Does WaitForSingleObject Serve as a Memory Barrier?

Is a memory barrier AND volatile ENOUGH to avoid a data race?

Is there a compiler memory barrier for a single variable?

Is `__asm nop` the Windows equivalent of `asm volatile(“nop”);` from GCC compiler

What is the life-time of asm volatile(“” ::: “memory”)?


solution1: 1 ACCEPTED 2021-06-11 23:31:39

solution2: 0 2022-09-16 14:27:15