
Read and write atomic operation implementation in the Linux kernel

Recently I peeked into the Linux kernel implementation of atomic read and write, and a few questions came up.

First, the relevant code from the ia64 architecture:

typedef struct {
    int counter;
} atomic_t;

#define atomic_read(v)      (*(volatile int *)&(v)->counter)
#define atomic64_read(v)    (*(volatile long *)&(v)->counter)

#define atomic_set(v,i)     (((v)->counter) = (i))
#define atomic64_set(v,i)   (((v)->counter) = (i))

  1. For both read and write operations, it seems that the direct approach was taken: the variable is simply read or written. Unless there is another trick somewhere, I do not understand what guarantees that this operation will be atomic at the assembly level. I guess an obvious answer is that such an operation translates to a single machine instruction, but even so, how is that guaranteed when taking into account the different levels of memory cache (or other optimizations)?

  2. In the read macros, volatile is used in a casting trick. Does anyone have a clue how this affects atomicity here? (Note that it is not used in the write operation.)

I think you are misunderstanding the (admittedly vague) usage of the words "atomic" and "volatile" here. Atomic only really means that a word will be read or written atomically: in one step, with the guarantee that the contents of that memory location are always the result of one write or another, never something in between. The volatile keyword tells the compiler never to assume what that location holds based on an earlier read or write (basically, never optimize away the read).

What "atomic" and "volatile" do NOT mean here is that there is any form of memory synchronization. Neither implies ANY read/write barriers or fences. Nothing is guaranteed with regard to memory and cache coherence. These functions are basically atomic only at the software level, and the hardware can optimize/lie however it sees fit.
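To make the missing-barrier point concrete, here is a minimal userspace sketch using C11 `<stdatomic.h>` and POSIX threads rather than the kernel's own primitives (that substitution is an assumption made for illustration; the names `payload`, `ready`, and `run_message_passing` are hypothetical). An atomic flag alone does not order the plain write to `payload`; the ordering has to be requested explicitly, here with a release store paired with an acquire load.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static int payload;        /* plain data, published through the flag */
static atomic_int ready;   /* 0 = not yet published, 1 = published */

static void *producer(void *arg)
{
    (void)arg;
    payload = 42;
    /* Release store: the write to payload above may not be
     * reordered after this store. */
    atomic_store_explicit(&ready, 1, memory_order_release);
    return NULL;
}

static void *consumer(void *arg)
{
    int *out = arg;

    /* Acquire load: pairs with the release store, so once the flag
     * is seen as 1 the write to payload is visible too. */
    while (atomic_load_explicit(&ready, memory_order_acquire) == 0)
        ;
    *out = payload;
    return NULL;
}

/* Hypothetical helper: runs one producer/consumer pair and returns
 * the value the consumer observed. */
static int run_message_passing(void)
{
    pthread_t p, c;
    int seen = -1;
    int rc;

    payload = 0;
    atomic_store(&ready, 0);

    rc = pthread_create(&c, NULL, consumer, &seen);
    assert(rc == 0);
    rc = pthread_create(&p, NULL, producer, NULL);
    assert(rc == 0);
    (void)rc;

    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return seen;
}
```

With `memory_order_relaxed` on both sides the flag accesses would still be atomic, but the consumer would no longer be guaranteed to see the new `payload`. That is the situation the plain `atomic_read`/`atomic_set` macros leave you in unless a barrier is added separately.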

Now as to why simply reading is enough: the memory model of each architecture is different. Many architectures guarantee atomic reads or writes for data that is aligned to a certain byte offset, or is at most x words long, and so on; the details vary from CPU to CPU. The Linux kernel contains many defines for the different architectures that let it do without any atomic instructions (CMPXCHG, basically) on platforms that guarantee atomic reads/writes (sometimes only in practice, even if their spec says they don't actually guarantee it).

As for volatile: while there is generally no need for it unless you're accessing memory-mapped IO, it all depends on when/where/why the atomic_read and atomic_set macros are being called. Some compilers attach extra ordering semantics to volatile accesses even though the C standard does not require it: MSVC does so by default on x86, whereas GCC emits no hardware barriers for volatile and only keeps the compiler itself from eliding volatile accesses or reordering them against each other. Declaring a variable volatile would normally mean that all reads/writes of it are exempt from just about any compiler optimization; here, by creating a "virtual" volatile variable through the cast, only this particular instance of a read/write is off-limits for optimization and reordering.
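The per-access effect of the cast can be sketched as follows. The macros mirror the shape of the ia64 definitions quoted in the question, but `my_atomic_t`, `my_atomic_read`, `my_atomic_set`, and the two helper functions are hypothetical names chosen for this example.

```c
#include <assert.h>

typedef struct {
    int counter;
} my_atomic_t;   /* hypothetical stand-in for the kernel's atomic_t */

/* The cast makes only this one access volatile; the field itself
 * stays a plain int. */
#define my_atomic_read(v)    (*(volatile int *)&(v)->counter)
#define my_atomic_set(v, i)  (((v)->counter) = (i))

/* Two volatile reads: the compiler must emit two separate loads and
 * may not reuse the first value for the second. */
static int read_twice_volatile(my_atomic_t *v)
{
    int a = my_atomic_read(v);
    int b = my_atomic_read(v);
    return a + b;
}

/* Two plain reads: the compiler is free to load once and double the
 * value, because nothing tells it the field can change in between. */
static int read_twice_plain(my_atomic_t *v)
{
    int a = v->counter;
    int b = v->counter;
    return a + b;
}
```

In a single thread both functions return the same result; the difference only shows in the generated code, or when another thread or an interrupt handler changes `counter` between the two reads. Neither version makes the hardware do anything special: the cast constrains the compiler only.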

Reads are atomic on most major architectures, so long as they are aligned to a multiple of their size (and aren't bigger than the read size of a given type); see the Intel Architecture manuals. Writes, on the other hand, may be different. Intel states that under x86, single-byte writes and aligned writes may be atomic; under IPF (IA64), everything uses acquire and release semantics, which would make it guaranteed atomic, see this.

The volatile prevents the compiler from caching the value locally, forcing it to be retrieved from memory wherever it is accessed.

If you write for a specific architecture, you can make assumptions specific to it.
I guess IA-64 does compile these things to a single instruction.

The cache shouldn't be an issue unless the counter crosses a cache line boundary. But if 4/8-byte alignment is required, this can't happen.
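The alignment argument can be checked with a little address arithmetic. This is a sketch under one loud assumption: a 64-byte cache line (`CACHE_LINE_SIZE` is a hypothetical constant, and the real line size depends on the CPU). The helper names are also made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64u   /* assumption: 64-byte cache lines */

/* True if p is aligned to a multiple of size (size must be a power
 * of two, as the sizes of int and long are). */
static int is_naturally_aligned(const void *p, size_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    return ((uintptr_t)p % size) == 0;
}

/* True if the object [p, p + size) touches more than one cache line. */
static int crosses_cache_line(const void *p, size_t size)
{
    uintptr_t first_line = (uintptr_t)p / CACHE_LINE_SIZE;
    uintptr_t last_line = ((uintptr_t)p + size - 1) / CACHE_LINE_SIZE;
    return first_line != last_line;
}
```

Because 4 and 8 both divide 64, a naturally aligned 4- or 8-byte counter always satisfies `!crosses_cache_line(p, size)`. Only a misaligned counter, for example a 4-byte value starting 62 bytes into a line, can straddle two lines.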

A "real" atomic instruction is required when a machine instruction translates into two memory accesses. This is the case for increments (read, increment, write) or compare-and-swap.
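A read-modify-write such as an increment is where a plain load or store stops being enough. The sketch below is a userspace analogue using C11 `atomic_fetch_add_explicit` and POSIX threads, not the kernel's own increment helpers; the thread count, iteration count, and function names are arbitrary choices for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define NUM_THREADS 4
#define INCREMENTS_PER_THREAD 100000

static atomic_long shared_counter;

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS_PER_THREAD; i++) {
        /* One indivisible read-increment-write. Relaxed ordering is
         * enough here because only the final total matters. */
        atomic_fetch_add_explicit(&shared_counter, 1,
                                  memory_order_relaxed);
    }
    return NULL;
}

/* Hypothetical helper: runs all workers and returns the final count. */
static long run_counter_demo(void)
{
    pthread_t threads[NUM_THREADS];

    atomic_store(&shared_counter, 0);
    for (int i = 0; i < NUM_THREADS; i++) {
        int rc = pthread_create(&threads[i], NULL, worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(threads[i], NULL);

    return atomic_load(&shared_counter);
}
```

If the loop body were a plain `counter++` on a non-atomic variable, two threads could both read the same old value and one increment would be lost (and the program would contain a data race). The atomic version always ends at `NUM_THREADS * INCREMENTS_PER_THREAD`.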

volatile affects the optimizations the compiler can do.
For example, it prevents the compiler from merging multiple reads into one read.
But at the machine instruction level, it does nothing.
