
Is this understanding correct for this code about Java volatile and reordering?

According to these reordering rules

reorder Rules

if I have code like this:

volatile int a = 0;
boolean b = false;

foo1() { a = 10; b = true; }
foo2() { if (b) { assert a == 10; } }

If Thread A runs foo1 and Thread B runs foo2: since a = 10 is a volatile store and b = true is a plain store, these two statements could possibly be reordered, which means Thread B may observe b == true while a != 10. Is that correct?
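For reference, here is a minimal, runnable sketch of that scenario, assuming a small wrapper class and a main() harness that are purely illustrative (run with assertions enabled, java -ea):

    // Runnable sketch of the question's code; class name and main() are illustrative.
    class ReorderQuestion {
        volatile int a = 0;
        boolean b = false;              // plain (non-volatile) field

        void foo1() { a = 10; b = true; }

        void foo2() {
            if (b) {
                // Nothing in the JMM guarantees a == 10 is visible here,
                // because b is a plain field written after the volatile store to a.
                assert a == 10;
            }
        }

        public static void main(String[] args) throws InterruptedException {
            ReorderQuestion q = new ReorderQuestion();
            Thread t1 = new Thread(q::foo1);
            Thread t2 = new Thread(q::foo2);
            t1.start(); t2.start();
            t1.join();  t2.join();
        }
    }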

Added:

Thanks for your answers!
I am just starting to learn about Java multi-threading and have been quite confused by the keyword volatile.

Many tutorials talk about the visibility of a volatile field, along the lines of "a volatile field becomes visible to all readers (other threads in particular) after a write operation completes on it". I have doubts about how a completed write to a field could be invisible to other threads (or CPUs).

As I understand it, a completed write means you have successfully written the field back to the cache, and according to MESI, all other threads should then have an invalid cache line if they had cached this field. One exception (since I am not very familiar with the hardware, this is just a conjecture) is that maybe the result is written back to a register instead of the cache, and I do not know whether there is some protocol to keep things consistent in that situation, or whether volatile prevents writing to a register in Java.

Some situations that look like this "invisibility" do happen, for example:

    A = 0, B = 0;
    thread1 { A = 1; B = 2; }
    thread2 { if (B == 2) { /* A may be 0 here */ } }

Suppose the compiler did not reorder it; what we see in thread2 is then due to the store buffer, and I do not think a write operation sitting in the store buffer counts as a completed write. Because of the store buffer and the invalidate queue strategy, the write to variable A looks invisible, but in fact the write operation has simply not finished by the time thread2 reads A. Even if we make field B volatile, so that the write to B is placed into the store buffer together with memory barriers, thread 2 can still read b as 0 and finish. To me, volatile does not look like it is about the visibility of the field it declares, but more like an edge that makes sure all the writes before the volatile field write in Thread A are visible to all operations after the volatile field read (where the volatile read happens after the volatile field write in Thread A has completed) in another Thread B.

By the way, since I am not a native speaker, I have seen many tutorials in my mother tongue (and also some English tutorials) say that volatile instructs JVM threads to read the value of a volatile variable from main memory and not to cache it locally, and I do not think that is true. Am I right?

Anyway, thanks for your answers. Since I am not a native speaker, I hope I have expressed myself clearly.

I'm pretty sure the assert can fire. I think a volatile load is only an acquire operation (https://preshing.com/20120913/acquire-and-release-semantics/) wrt. non-volatile variables, so nothing is stopping load-load reordering.

Two volatile operations can't reorder with each other, but reordering with non-atomic operations is possible in one direction, and you picked the direction without guarantees.

(Caveat: I'm not a Java expert; it's possible but unlikely that volatile has some semantics that require a more expensive implementation.)


More concrete reasoning: if the assert can fire when translated into asm for some specific architecture, it must be allowed to fire by the Java memory model.

Java volatile is (AFAIK) equivalent to C++ std::atomic with the default memory_order_seq_cst. Thus foo2 can JIT-compile for ARM64 with a plain load for b and an LDAR acquire load for a.

ldar can't reorder with later loads/stores, but can with earlier ones. (Except with respect to stlr release stores; ARM64 was specifically designed to make C++ std::atomic<> with memory_order_seq_cst / Java volatile efficient with ldar and stlr, not having to flush the store buffer immediately on seq_cst stores, only on encountering an LDAR. That design gives the minimal amount of ordering necessary to still recover sequential consistency as specified by C++ (and, I assume, Java).)

On many other ISAs, sequential-consistency stores do need to wait for the store buffer to drain itself, so they are in practice ordered wrt. later non-atomic loads. And again on many ISAs, an acquire or SC load is done with a normal load preceded by a barrier which blocks loads from crossing it in either direction, otherwise they wouldn't work. That's why having the volatile load of a compile to an acquire-load instruction that only performs an acquire operation is key to understanding how this can happen in practice.

(In x86 asm, all loads are acquire loads and all stores are release stores. Not sequential-release, though; x86's memory model is program order plus a store buffer with store-forwarding, which allows StoreLoad reordering, so Java volatile stores need special asm.

So the assert can't fire on x86, except via compile/JIT-time reordering of the assignments. This is a good example of one reason why testing lock-free code is hard: a failing test can prove there is a problem, but testing on some hardware/software combo can't prove correctness.)
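To make that last point concrete, here is a naive, hand-rolled stress loop (purely illustrative; a purpose-built tool such as jcstress is much better at provoking these interleavings). If it ever prints, the reordering demonstrably happened; millions of silent iterations prove nothing.

    // Hand-rolled stress harness for the question's code (illustrative only).
    class ReorderStress {
        volatile int a = 0;
        int b = 0;   // plain field, written after the volatile store

        public static void main(String[] args) throws InterruptedException {
            for (int i = 0; i < 1_000_000; i++) {
                final int iter = i;                   // effectively final copy for the lambda
                ReorderStress s = new ReorderStress();
                Thread writer = new Thread(() -> { s.a = 10; s.b = 1; });
                Thread reader = new Thread(() -> {
                    int rb = s.b;                     // plain load
                    int ra = s.a;                     // volatile (acquire) load
                    if (rb == 1 && ra != 10) {
                        System.out.println("reordering observed at iteration " + iter);
                    }
                });
                writer.start(); reader.start();
                writer.join();  reader.join();
            }
        }
    }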

In addition to Peter Cordes' great answer: in terms of the JMM there is a data race on b, since there is no happens-before edge between the write of b and the read of b, because it is a plain variable. Only if that happens-before edge existed would you be guaranteed that a thread which loads b=1 also sees a=1.

Instead of making a volatile, you need to make b volatile.

int a = 0;
volatile int b = 0;

thread1() {
    a = 1;
    b = 1;
}

thread2() {
    if (b == 1) assert a == 1;
}

So if thread2 sees b=1, then the write of b=1 is ordered before that read in the happens-before order (volatile variable rule). And since a=1 and b=1 are ordered in the happens-before order (program order rule), and the read of b and the read of a are ordered in the happens-before order (program order rule again), then, due to the transitive nature of the happens-before relation, there is a happens-before edge between the write of a=1 and the read of a, which therefore needs to see the value 1.

You are referring to a possible implementation of the JMM using fences. And although it provides some insight into what happens under the hood, it is equally damaging to think in terms of fences, because they are not a suitable mental model. See the following counterexample:

https://shipilev.net/blog/2016/close-encounters-of-jmm-kind/#myth-barriers-are-sane

Yes, the assert can fail.

volatile int a = 0;
boolean b = false;

foo1() { a = 10; b = true; }
foo2() { if (b) { assert a == 10; } }

The JMM guarantees that a write to a volatile field happens-before subsequent reads of that field. In your example, whatever thread A did before a = 10 will happen-before whatever thread B does after reading a (while executing assert a == 10). But since b = true executes after a = 10 in thread A (within a single thread, program order always gives happens-before) and b is a plain field, observing b == true in thread B gives no ordering guarantee. However, consider this:

int a = 0;
volatile boolean b = false;

foo1() { a = 10; b = true; }
foo2() { if (b) { assert a == 10; } }

In this example, the situation is:

a = 10 ---> b = true---|
                       |
                       | (happens-before due to volatile's semantics)
                       |
                       |---> if(b) ---> assert a == 10

                

Since you have this total order, the assert is guaranteed to pass.

Answer to your addition.

Many tutorials talk about the visibility of a volatile field, along the lines of "a volatile field becomes visible to all readers (other threads in particular) after a write operation completes on it". I have doubts about how a completed write to a field could be invisible to other threads (or CPUs).

The compiler might mess up the code.

e.g.

boolean stop;

void run() {
    while (!stop) println();
}

first optimization

void run() {
    boolean r1 = stop;
    while (!r1) println();
}

second optimization

void run() {
    boolean r1 = stop;
    if (r1) return;           // stop was already true: the loop body never runs
    while (true) println();   // stop was false: loop forever, stop is never re-read
}

So now it is obvious that this loop will never stop, because effectively the new value of stop will never be seen. For a store you can do something similar that could postpone it indefinitely.
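As a rough sketch of the usual remedy (field and class names as in the example above, the wrapper class is just illustrative): declaring stop volatile forbids hoisting the load out of the loop, so every iteration performs a fresh read.

    class StopFlag {
        volatile boolean stop;       // volatile: the read below may not be hoisted out of the loop

        void run() {
            while (!stop) {          // every iteration re-reads the field
                System.out.println();
            }
        }
    }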

As I understand it, a completed write means you have successfully written the field back to the cache, and according to MESI, all other threads should then have an invalid cache line if they had cached this field.

Correct. This is normally called 'globally visible' or 'globally performed'.

One exception (since I am not very familiar with the hardware, this is just a conjecture) is that maybe the result is written back to a register instead of the cache, and I do not know whether there is some protocol to keep things consistent in that situation, or whether volatile prevents writing to a register in Java.

All modern processors are load/store architectures (even x86, after the conversion to uops), meaning that there are explicit load and store instructions that transfer data between registers and memory, and regular instructions like add/sub can only work with registers. So a register needs to be used anyway. The key part is that the compiler must respect the loads/stores of the source code and limit its optimizations.

Suppose the compiler did not reorder it; what we see in thread2 is then due to the store buffer, and I do not think a write operation in the store buffer means a completed write. Because of the store buffer and the invalidate queue strategy, the write to variable A looks invisible, but in fact the write operation has not finished while thread2 reads A.

On x86 the order of the stores in the store buffer is consistent with program order, and they commit to the cache in program order. But there are architectures where stores from the store buffer can commit to the cache out of order, e.g. due to:

  • write coalescing

  • allowing stores to commit to the cache as soon as the cache line is returned in the right state, no matter whether an earlier store is still waiting.

  • sharing the store buffer with a subset of the CPUs.

Store buffers can be a source of reordering; but out-of-order and speculative execution can be a source as well.

Apart from the stores, reordering of loads can also lead to observing stores out of order. On x86, loads can't be reordered with each other, but on ARM this is allowed. And of course the JIT can mess things up as well.

Even if we make field B volatile, when we put the write to field B into the store buffer together with memory barriers, thread 2 can still read b as 0 and finish.

It is important to realize that the JMM is based on sequential consistency; so even though it is a relaxed memory model (it separates plain loads and stores from synchronization actions like volatile load/store and lock/unlock), if a program has no data races, it will only produce sequentially consistent executions. For sequential consistency the real-time order doesn't need to be respected. So it is perfectly fine for a load/store to be skewed as long as:

  1. the memory order is a total order over all loads/stores

  2. the memory order is consistent with the program order

  3. a load sees the most recent write before it in the memory order.

To me, volatile does not look like it is about the visibility of the field it declares, but more like an edge that makes sure all the writes before the volatile field write in Thread A are visible to all operations after the volatile field read (where the volatile read happens after the volatile field write in Thread A has completed) in another Thread B.

You are on the right path.

Example:

int a = 0;
volatile int b = 0;

thread1() {
   1: a = 1;
   2: b = 1;
}

thread2() {
   3: r1 = b;
   4: r2 = a;
}

In this case there is a happens-before edge between 1 and 2 (program order). If r1=1, then there is a happens-before edge between 2 and 3 (volatile variable rule) and a happens-before edge between 3 and 4 (program order).

Because the happens-before relation is transitive, there is a happens-before edge between 1 and 4. So r2 must be 1.

volatile takes care of the following:

  • Visibility: it needs to make sure the load/store doesn't get optimized out.

  • Atomicity: a load/store of the field is atomic, so it should never be seen partially (see the sketch after this list).

  • And most importantly, it needs to make sure that the order between 1-2 and 3-4 is preserved.
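For the atomicity bullet, a small hedged sketch (class and field names are just examples): per JLS §17.7, a non-volatile long or double write may legally be split into two 32-bit halves on some JVMs, whereas declaring the field volatile guarantees the full 64-bit write and read are atomic.

    class Ticker {
        // Without volatile, JLS 17.7 would allow this 64-bit write to be performed
        // as two separate 32-bit writes, so a reader could observe a torn value.
        volatile long lastNanos;

        void update() { lastNanos = System.nanoTime(); }
        long read()   { return lastNanos; }
    }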

By the way, since I am not a native speaker, I have seen many tutorials in my mother tongue (and also some English tutorials) say that volatile instructs JVM threads to read the value of a volatile variable from main memory and not to cache it locally, and I do not think that is true.

You are completely right. This is a very common misconception. Caches are the source of truth, since they are always coherent. If every write needed to go to main memory, programs would become extremely slow. Memory is just a spill bucket for whatever doesn't fit in the caches and can be completely incoherent with them. Plain and volatile loads/stores alike are served by the cache. It is possible to bypass the cache in special situations like MMIO or when using e.g. certain SIMD instructions, but that isn't relevant for these examples.

Anyway, thanks for your answers. Since I am not a native speaker, I hope I have expressed myself clearly.

Most people here are not native speakers (I'm certainly not). Your English is good enough and you show a lot of promise.
