
Should load-acquire see store-release immediately?

Suppose we have one simple variable (std::atomic<int> var) and two threads, T1 and T2, with the following code for T1:

...
var.store(2, mem_order);
...

and for T2:

...
var.load(mem_order);
...

Also, let's assume that T2 (load) executes 123 ns later in time (later in the modification order, in terms of the C++ standard) than T1 (store). My understanding of this situation is as follows (for different memory orders):

  1. memory_order_seq_cst - T2 load is obliged to load 2. So effectively it has to load the latest value (just as is the case with RMW operations).
  2. memory_order_acquire / memory_order_release / memory_order_relaxed - T2 is not obliged to load 2 but can load any older value, with one restriction: that value must not be older than the latest value already loaded by that thread. So, for example, var.load may return 0.

Am I right in my understanding?

UPDATE1:

If my reasoning is wrong, please provide the text from the C++ standard that proves it, not just theoretical reasoning about how some architecture might work.

Am I right in my understanding?

No. You misunderstand memory orders.

let's assume that T2 (load) executes 123ns later than T1 (store)...

In that case, T2 will see what T1 did, whatever memory order is used (moreover, this property applies to reads and writes of any memory region; see e.g. http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4431.pdf , 1.10, p.15). The key word in your phrase is "later": it means that someone else forces the ordering of these operations.

Memory orders are meant for a different scenario:

Let some operation OP1 come before the store in thread T1 and OP2 after it; let OP3 come before the load in thread T2 and OP4 after it.

//T1:                         //T2:
OP1                           OP3
var.store(2, mem_order)       var.load(mem_order)
OP2                           OP4

Assume that some order between var.store() and var.load() can be observed by the threads. What can one guarantee about the cross-thread order of the other operations?

  1. If var.store uses memory_order_release, var.load uses memory_order_acquire, and var.store is ordered before var.load (that is, the load returns 2), then the effect of OP1 is ordered before OP4.

E.g., if OP1 writes some variable var1 and OP4 reads that variable, then one can be assured that OP4 will read what OP1 wrote. This is the most commonly used case.

  2. If both var.store and var.load use memory_order_seq_cst and var.store is ordered after var.load (that is, the load returns 0, which was the value of the variable before the store), then the effect of OP2 is ordered after OP3.

This memory order is required by some tricky synchronization schemes.

  3. If either var.store or var.load uses memory_order_relaxed, then whatever the order of var.store and var.load, no cross-thread ordering of the other operations is guaranteed.

This memory order is used when something else ensures the order of operations. E.g., if the creation of thread T2 comes after var.store in T1, then OP3 and OP4 are ordered after OP1.
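
The release/acquire case (item 1) can be sketched as follows. This is only an illustration under stated assumptions: the names payload (standing in for var1) and run_release_acquire are hypothetical and do not come from the question, and the reader spins until the load actually returns 2, because the guarantee applies only once the load has observed the store.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names for illustration: payload stands in for var1.
int payload = 0;              // plain, non-atomic data written by OP1
std::atomic<int> var{0};      // the atomic variable from the question

// Returns the value OP4 reads from payload after the acquire load returned 2.
int run_release_acquire() {
    payload = 0;
    var.store(0, std::memory_order_relaxed);
    int seen = -1;

    std::thread t1([] {
        payload = 42;                                // OP1
        var.store(2, std::memory_order_release);     // store-release
    });
    std::thread t2([&seen] {
        // Wait until the load really returns 2; only then is the
        // store ordered before the load.
        while (var.load(std::memory_order_acquire) != 2) {
            std::this_thread::yield();
        }
        seen = payload;                              // OP4: reads what OP1 wrote
    });

    t1.join();
    t2.join();
    return seen;
}
```

Calling run_release_acquire() from main should return 42. If both operations used memory_order_relaxed instead, reading payload in T2 would be a data race, since nothing would order OP1 before OP4.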

UPDATE: "123 ns later" implies that someone else forces the ordering, because a computer's processors have no notion of universal time, and no operation has a precise moment at which it executes. To measure the time between two operations you have to:

  1. Observe the ordering between the end of the first operation and the start of the time-counting operation on some CPU.
  2. Observe the ordering between the start and the end of the time-counting operation.
  3. Observe the ordering between the end of the time-counting operation and the start of the second operation.

Transitively, these steps establish an ordering between the first operation and the second one.

Having found no arguments that prove my understanding wrong, I deem it correct. My proof is as follows:

memory_order_seq_cst - T2 load is obliged to load 2.

That's correct because all operations using memory_order_seq_cst must form a single total order covering all memory_order_seq_cst operations. Excerpt from the standard:

[29.3/3] There shall be a single total order S on all memory_order_seq_cst operations, consistent with the “happens before” order and modification orders for all affected locations, such that each memory_order_seq_cst operation B that loads a value from an atomic object M observes one of the following values <...>

The next point of my question:

memory_order_acquire/memory_order_release/memory_order_relaxed - T2 is not obliged to load 2 but can load any older value <...>

I did not find any evidence that a load executed later in the modification order must see the latest value. The only relevant points I found for store/load operations with a memory order other than memory_order_seq_cst are these:

[29.3/12] Implementations should make atomic stores visible to atomic loads within a reasonable amount of time.

and

[1.10/28] An implementation should ensure that the last value (in modification order) assigned by an atomic or synchronization operation will become visible to all other threads in a finite period of time.

So the only guarantee we have is that the written value will become visible within some time. That is a reasonable guarantee, but it does not imply immediate visibility of the preceding store. This proves my second point.

Given all that, my initial understanding was correct.
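
A minimal sketch of what "visible within some time" means in code, assuming a hypothetical helper count_stale_loads that is not part of the question: the reader polls with relaxed loads and counts how often it still sees the old value. Nothing requires the first load to return 2; the quoted passages only suggest that the loop is expected to end eventually.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: returns how many loads saw a value other than 2
// before the store finally became visible to the reader.
long count_stale_loads() {
    std::atomic<int> var{0};
    long stale = 0;

    std::thread reader([&] {
        // Each relaxed load may return 0 or 2; the count depends on timing.
        while (var.load(std::memory_order_relaxed) != 2) {
            ++stale;
            std::this_thread::yield();
        }
    });
    std::thread writer([&] {
        var.store(2, std::memory_order_relaxed);
    });

    writer.join();
    reader.join();   // returns once the reader has observed the store
    return stale;
}
```

The returned count varies from run to run and may well be zero. The only thing the sketch relies on is that the reader's loop terminates.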

Being 123 ns later does not force T2 to see the results of T1. If the physical program counter (transistors, etc.) running T2 is more than 40 meters away from the physical program counter running T1 (a large multi-core supercomputer, etc.), then the speed of light does not allow the state written by T1 to have propagated that far yet. The effect is similar if the physical memory used for the loads/stores is some distance away from both thread processors.
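
The distance figure can be checked with a short calculation. The function name light_distance_m is an assumption made for illustration; the constant is the speed of light in vacuum, so the result is an upper bound on how far any signal can travel in the given time.

```cpp
#include <cassert>

// Distance in meters that light travels in vacuum in `ns` nanoseconds.
// Real signals in wires or fiber are slower, so this is an upper bound.
constexpr double light_distance_m(double ns) {
    return 299792458.0 * ns * 1e-9;   // c [m/s] * time [s]
}
```

For 123 ns this gives a little under 37 meters, so two cores more than 40 meters apart cannot have exchanged any information within that interval.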
