Should load-acquire see store-release immediately?
Suppose we have one simple variable (std::atomic<int> var) and two threads, T1 and T2, and we have the following code for T1:
...
var.store(2, mem_order);
...
and for T2:
...
var.load(mem_order)
...
Also let's assume that T2 (load) executes 123 ns later in time (later in the modification order, in terms of the C++ standard) than T1 (store). My understanding of this situation is as follows (for different memory orders):
memory_order_seq_cst - T2 load is obliged to load 2. So effectively it has to load the latest value (just as is the case with RMW operations).
memory_order_acquire / memory_order_release / memory_order_relaxed - T2 is not obliged to load 2 but can load any older value, with the only restriction that the value must not be older than the latest value already loaded by that thread. So, for example, var.load may return 0.
Am I right with my understanding?
UPDATE1:
If I'm wrong in my reasoning, please provide the text from the C++ standard that proves it, not just theoretical reasoning about how some architecture might work.
Am I right with my understanding?
No. You misunderstand memory orders.
let's assume that T2 (load) executes 123ns later than T1 (store)...
In that case, T2 will see what T1 does with any type of memory order (moreover, this property applies to reads and writes of any memory region; see e.g. http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4431.pdf, 1.10, p.15). The key word in your phrase is "later": it means that someone else forces an ordering of these operations.
Memory orders are used for a different scenario:
Let some operation OP1 come in thread T1 before the store operation and OP2 come after it; let OP3 come in thread T2 before the load operation and OP4 come after it.
//T1:                       //T2:
OP1                         OP3
var.store(2, mem_order)     var.load(mem_order)
OP2                         OP4
Assume that some order between var.store() and var.load() can be observed by the threads. What can one guarantee about the cross-thread order of the other operations?
If var.store uses memory_order_release, var.load uses memory_order_acquire, and var.store is ordered before var.load (that is, load returns 2), then the effect of OP1 is ordered before OP4. E.g., if OP1 writes some variable var1 and OP4 reads that variable, then one can be assured that OP4 will read what OP1 wrote. This is the most commonly used case.
If var.store and var.load both use memory_order_seq_cst and var.store is ordered after var.load (that is, load returns 0, which was the value of the variable before the store), then the effect of OP2 is ordered after OP3. This memory order is required by some tricky synchronization schemes.
If var.store or var.load uses memory_order_relaxed, then, whatever the order of var.store and var.load, no ordering of the cross-thread operations is guaranteed. This memory order is used when someone else ensures the order of operations. E.g., if the creation of thread T2 comes after var.store in T1, then OP3 and OP4 are ordered after OP1.
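The release/acquire case can be sketched roughly as follows. This is a minimal illustration, not code from the thread: the function name run_message_passing and the plain variable payload (standing in for var1) are assumptions made for the example. T2 spins until its acquire load actually returns 2, because the guarantee about OP1 and OP4 only applies once the load has observed the store.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: `payload` plays the role of var1 (written by OP1),
// `var` is the atomic variable from the question.
int run_message_passing() {
    int payload = 0;               // plain, non-atomic data
    std::atomic<int> var{0};
    int seen = -1;

    std::thread t1([&] {
        payload = 42;                                // OP1
        var.store(2, std::memory_order_release);     // release store
    });

    std::thread t2([&] {
        // Wait until the load returns 2. Only then is the store ordered
        // before the load, so the release/acquire pairing applies.
        while (var.load(std::memory_order_acquire) != 2) {
            std::this_thread::yield();
        }
        seen = payload;                              // OP4: reads what OP1 wrote
        assert(seen == 42);
    });

    t1.join();
    t2.join();
    return seen;
}
```

Calling run_message_passing() from main should return 42 on every run. If both operations were memory_order_relaxed instead, the read of payload would be a data race, since nothing would order OP1 before OP4.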
UPDATE: 123 ns later implies that *someone else* forces the ordering, because a computer's processor has no notion of universal time, and no operation has a precise moment at which it is executed. To measure the time between two operations you should:
Transitively, these steps create an ordering between the first operation and the second one.
Having found no arguments that prove my understanding wrong, I deem it correct. My proof is as follows:
memory_order_seq_cst - T2 load is obliged to load 2.
That's correct because all operations using memory_order_seq_cst should form a single total order of all the memory operations on the atomic variable. Excerpt from the standard:
[29.9/3] There shall be a single total order S on all memory_order_seq_cst operations, consistent with the “happens before” order and modification orders for all affected locations, such that each memory_order_seq_cst operation B that loads a value from an atomic object M observes one of the following values <...>
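One consequence of the single total order S can be sketched with the classic store-buffering test. This is an illustration under stated assumptions rather than anything from the thread: the names both_loads_saw_zero, x, y, r1 and r2 are hypothetical. Each thread stores to its own variable and then loads the other one; since one of the two stores must come first in S, the other thread's later load has to observe it, so the outcome r1 == 0 && r2 == 0 should never occur.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper: repeats the store-buffering test and reports whether
// any run produced r1 == 0 && r2 == 0, which seq_cst ordering rules out.
bool both_loads_saw_zero(int iterations) {
    for (int i = 0; i < iterations; ++i) {
        std::atomic<int> x{0}, y{0};
        int r1 = -1, r2 = -1;

        std::thread a([&] {
            x.store(1, std::memory_order_seq_cst);
            r1 = y.load(std::memory_order_seq_cst);
        });
        std::thread b([&] {
            y.store(1, std::memory_order_seq_cst);
            r2 = x.load(std::memory_order_seq_cst);
        });

        a.join();
        b.join();
        if (r1 == 0 && r2 == 0) {
            return true;   // would contradict the single total order S
        }
    }
    return false;
}
```

With memory_order_release stores and memory_order_acquire loads the same test is allowed to return true, which is one way to see what memory_order_seq_cst adds.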
The next point of my question:
memory_order_acquire/memory_order_release/memory_order_relaxed - T2 is not obliged to load 2 but can load any older value <...>
I didn't find any evidence indicating that a load executed later in the modification order should see the latest value. The only points I found for store/load operations with any memory order other than memory_order_seq_cst are these:
[29.3/12] Implementations should make atomic stores visible to atomic loads within a reasonable amount of time.
and
[1.10/28] An implementation should ensure that the last value (in modification order) assigned by an atomic or synchronization operation will become visible to all other threads in a finite period of time.
So the only guarantee we have is that the written value will become visible within some time. That is a pretty reasonable guarantee, but it doesn't imply immediate visibility of the previous store. And it proves my second point.
Given all that, my initial understanding was correct.
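The "visible within a finite period of time" wording can be illustrated with a polling loop. This is only a sketch under assumptions: the function name poll_until_visible and the spins counter are made up for the example, and the count it reports varies from run to run and from machine to machine. The reader keeps issuing relaxed loads until one of them returns 2; the standard's wording says the loop should terminate eventually, but says nothing about how many loads may still return the old value first.

```cpp
#include <atomic>
#include <thread>
#include <utility>

// Hypothetical helper: returns {value finally observed, number of loads
// that still returned the old value before the store became visible}.
std::pair<int, long> poll_until_visible() {
    std::atomic<int> var{0};
    int observed = 0;
    long spins = 0;

    std::thread t2([&] {
        // Relaxed loads: no ordering guarantees, only eventual visibility.
        while ((observed = var.load(std::memory_order_relaxed)) != 2) {
            ++spins;
        }
    });
    std::thread t1([&] {
        var.store(2, std::memory_order_relaxed);
    });

    t1.join();
    t2.join();
    return {observed, spins};
}
```

The first element of the result is always 2 once the function returns; the second is whatever number of stale loads happened to occur in that run.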
"123 ns later" doesn't force an ordering in which T2 sees the results of T1. That's because if the physical program counter (transistors, etc.) running T2 is more than 40 meters away from the physical program counter running T1 (a large multi-core supercomputer, etc.), then the speed of light will not allow the state written by T1 to propagate that far (yet). The effect is similar if the physical memory used for the loads/stores is some distance away from both thread processors.