Will a non-atomic load to the same cache line as an atomic variable cause the atomic variable to fail?

Given something like this on an ARMv8 CPU (though this may apply to many others as well):

class abcxzy
{
  // Pragma align to cacheline to ensure they exist on same line.
  uint32_t atomic_data;
  uint32_t data;

  void foo()
  {
    asm volatile (
      // Plain (non-exclusive) load and store to data.
      "   ldr w0, [address of data]\n"
      "# Do stuff with data in w0...\n"
      "   str w0, [address of data]\n"

      // Exclusive load/store retry loop on atomic_data.
      "1: ldaxr w0, [address of atomic_data]\n"
      "   add w1, w0, #0x1\n"
      "   stxr w2, w1, [address of atomic_data]\n"
      "   cbnz w2, 1b\n"   // w2 != 0 means the exclusive store failed
    );
  }
};

With proper clobbers and such set on the Asm inline so that C and Asm can coexist happily in a world of rainbow ponies and sunshine.

In a multiple CPU situation, all running this code at the same time, will the stores to data cause the atomic load/store to atomic_data to fail? From what I've read, the ARM atomic stuff works on a cache line basis, but it is not clear if the non-atomic store will affect the atomic. I hope that it doesn't (and assume that it does...), but I am looking to see if anyone else can confirm this.

Ok, finally found what I needed, though I don't like it:

According to the ARM documentation, it is IMPLEMENTATION DEFINED whether a non-exclusive store to the same cache line as the exclusive store causes the exclusive store to fail. Thanks ARM. Appreciate that wonderful non-conclusive info.


Edit:

By fail, I mean the stxr instruction did not write to memory and returned a "1" in the status register. That is the "Your atomic data was updated and needs a new RMW" status.
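For readers who prefer portable C++ to inline asm, the same "store failed, go around again" behaviour can be sketched with `compare_exchange_weak`. This is only an illustration, not the code from the question: the function name `increment_with_retry` is made up here, and while AArch64 compilers typically lower this loop to an exclusive load/store pair when the newer single-instruction atomics are not in use, that lowering is an assumption about the toolchain.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical helper: add 1 to `counter` and return how many store
// attempts failed before one succeeded.
std::uint32_t increment_with_retry(std::atomic<std::uint32_t>& counter)
{
  std::uint32_t failures = 0;
  std::uint32_t seen = counter.load(std::memory_order_relaxed);

  // compare_exchange_weak is allowed to fail even when `seen` still
  // matches (a "spurious" failure), which is the C++ view of an
  // exclusive store reporting a non-zero status. On failure it reloads
  // the current value into `seen`, so the loop simply tries again.
  while (!counter.compare_exchange_weak(seen, seen + 1,
                                        std::memory_order_acquire,
                                        std::memory_order_relaxed))
  {
    ++failures;
  }
  return failures;
}
```

The returned failure count is the interesting number: if unrelated stores to the same line do cancel the reservation on a given implementation, it would show up as retries here rather than as wrong results.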

To answer other statements:

  • Yes, atomic critical areas should be as small as possible. The docs even give numbers on how small, and they are very reasonable indeed. I hope that my sections never span 1k or more...

  • And yes, any situation where you would need to worry about this kind of contention killing performance or worse means your code is "doing it wrong." The ARM docs state this in a roundabout manner :)

  • As to putting the non-atomic loads and stores inside the atomics - my pseudo test above was just demonstrating a random access to the same cache line as an example. In real code, you obviously should avoid this. I was just trying to get a feeling for how "bad" it might be if, perhaps, a high speed hardware timer store was hitting the same cache line as a lock. Again, don't do this...
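Since the behaviour is implementation defined, the practical way to "avoid this" is to keep the atomic and the frequently written plain data on different lines. Below is a minimal sketch of that layout. The 64-byte `kCacheLine` value is an assumption (the granule an ARM core uses to track exclusives is itself implementation defined and may not match it), and the names `Separated` and `same_line` are invented for the example.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed line size; check the target CPU before relying on this value.
constexpr std::size_t kCacheLine = 64;

// Hypothetical layout: each member starts on its own assumed cache line,
// so plain stores to `data` never touch the line holding `atomic_data`.
struct Separated
{
  alignas(kCacheLine) std::atomic<std::uint32_t> atomic_data{0};
  alignas(kCacheLine) std::uint32_t data{0};
};

// Returns true if both members fall inside the same kCacheLine-sized block.
bool same_line(const Separated& s)
{
  auto a = reinterpret_cast<std::uintptr_t>(&s.atomic_data);
  auto d = reinterpret_cast<std::uintptr_t>(&s.data);
  return (a / kCacheLine) == (d / kCacheLine);
}
```

The cost is memory: each instance grows to two full lines, which is usually an acceptable trade for a lock or a hot counter but not for small objects allocated in bulk.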
