
C++11 atomics: does it make sense, or is it even possible, to use them with memory mapped I/O?

As I understand it, C volatile, optionally combined with inline asm for memory fences, has been used to implement device drivers on top of memory mapped I/O. Several examples can be found in the Linux kernel.

If we set aside the risk of uncaught exceptions (if any), does it make sense to replace them with C++11 atomics? Or is it possible at all?

As I understand from reading the references, std::atomic is designed to manage multi-threaded access to memory (concurrency and so on). volatile, as far as I know and as you said, is designed for things like memory mapped I/O and signal handling. So volatile does not make accesses atomic and, used alone, does not resolve multi-threaded access issues the way atomics do. And vice versa: atomics do not provide the features of volatile.

Thus, the short answer to your question is NO.

In general, you can replace memory fences with atomics, but not volatile, except where it is used together with a fence exclusively for inter-thread communication.
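To illustrate the one case where the replacement does work, here is a minimal sketch of an inter-thread hand-off where an atomic flag with release/acquire ordering takes the place of a volatile flag plus explicit fences. All names (payload, ready, producer, consumer, run_handoff) are made up for the example, and no device memory is involved.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int payload = 0;                  // plain data handed from producer to consumer
std::atomic<bool> ready{false};   // stands in for "volatile flag + fences"

void producer() {
    payload = 42;                                   // ordinary write
    ready.store(true, std::memory_order_release);   // publishes the write above
}

int consumer() {
    // The acquire load pairs with the release store in producer().
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return payload;   // guaranteed to observe 42 once ready is seen as true
}

// Runs one hand-off between two threads and returns what the consumer saw.
int run_handoff() {
    payload = 0;
    ready.store(false);
    int seen = 0;
    std::thread t([&seen] { seen = consumer(); });
    producer();
    t.join();   // join makes the write to seen visible to this thread
    assert(seen == 42);
    return seen;
}
```

This only covers communication between threads of the same program. It says nothing about a device sitting behind the address.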

With regard to memory mapped I/O, the reason atomics don't suffice is that:

  • volatile guarantees that all memory accesses to that variable in your program actually happen and that they happen (within a single thread) exactly in the order you specify.
  • std::atomic only guarantees that your program will behave as if all those memory accesses happen (according to C++'s memory model, which doesn't know about memory mapped I/O) and, depending on the specified memory ordering, as if they happen in the specified order.
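The volatile side of that comparison can be sketched as follows. This is only an illustration: FakeUart, fake_device, reg_write, reg_read and send_two_bytes are hypothetical names, and the "device" is ordinary memory so the code can run anywhere. On real hardware the pointer would come from an address in the data sheet, and you may still need platform-specific barriers on top of volatile.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical register layout; not taken from any real device.
struct FakeUart {
    std::uint32_t data;
    std::uint32_t status;
};

// Ordinary memory standing in for the device registers.
static volatile FakeUart fake_device{0, 0};

// Each call performs exactly one store; the compiler may not drop or merge it.
inline void reg_write(volatile std::uint32_t* reg, std::uint32_t value) {
    assert(reg != nullptr);
    *reg = value;
}

// Each call performs exactly one load, even if the caller ignores the result.
inline std::uint32_t reg_read(volatile std::uint32_t* reg) {
    assert(reg != nullptr);
    return *reg;
}

// Writes two values to the simulated data register and returns its contents.
std::uint32_t send_two_bytes() {
    volatile std::uint32_t* data = &fake_device.data;
    reg_write(data, 0x41u);   // both stores are emitted, in this order
    reg_write(data, 0x42u);
    (void)reg_read(data);     // this load is emitted although it is unused
    return reg_read(data);
}
```

With a plain or atomic variable in place of the volatile one, the first store and the discarded load would be candidates for removal.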

In practical terms that means that the compiler can, for example, replace consecutive writes to the same (non-volatile) atomic with a single write (if there is no other synchronization in between), and the same is true for reads. If the result of a read is not used, it could even eliminate the read completely (the compiler might still have to issue a memory barrier, though).
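A small sketch of the kind of code this applies to (progress and report_progress are made-up names). The compiler is allowed, though not required, to collapse the three stores into a single store of 3, because nothing in the C++ memory model lets a conforming program rely on seeing the intermediate values. Whether a given compiler actually does this is a separate question.

```cpp
#include <atomic>
#include <cassert>

std::atomic<int> progress{0};   // non-volatile atomic, no device behind it

int report_progress() {
    // No other synchronization separates these stores, so an optimizer
    // may legally emit only the last one.
    progress.store(1, std::memory_order_relaxed);
    progress.store(2, std::memory_order_relaxed);
    progress.store(3, std::memory_order_relaxed);

    int final_value = progress.load(std::memory_order_relaxed);
    assert(final_value == 3);   // the as-if rule preserves this result
    return final_value;
}
```

If progress were a device register where each write triggers an action, losing the writes of 1 and 2 would change what the hardware does, which is exactly why an atomic alone is not enough there.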

On a more theoretical level, if your compiler can prove that all your program does is return 42, then it is allowed to transform it into a single instruction, independently of how many threads and atomics your program uses in the process. If your program uses volatile variables, that is not the case.

EDIT: For example, this paper shows a few possible (and probably unexpected) optimizations the compiler is allowed to apply to an atomic loop variable.
