
How does the C++ Standard prevent deadlock in a spinlock mutex with memory_order_acquire and memory_order_release?

TL:DR: if a mutex implementation uses acquire and release operations, could an implementation do compile-time reordering of the kind that is normally allowed, and overlap two critical sections that should be independent because they use different locks? That would lead to a potential deadlock.


Assume a mutex is implemented on top of std::atomic_flag:

struct mutex
{
   void lock()
   {
       while (flag.test_and_set(std::memory_order_acquire))
       {
          yield_execution();
       }
   }

   void unlock()
   {
       flag.clear(std::memory_order_release);
   }

   std::atomic_flag flag; // = ATOMIC_FLAG_INIT in pre-C++20
};

So far this looks OK for a single such mutex: the std::memory_order_release in unlock() synchronizes-with the std::memory_order_acquire in a subsequent lock().

At first sight, the use of std::memory_order_acquire / std::memory_order_release here should not raise questions. It is similar to the cppreference example: https://en.cppreference.com/w/cpp/atomic/atomic_flag

Now there are two mutexes guarding different variables, and two threads accessing them in different order:

mutex m1;
data  v1;

mutex m2;
data  v2;

void threadA()
{
    m1.lock();
    v1.use();
    m1.unlock();

    m2.lock();
    v2.use();
    m2.unlock();
}

void threadB()
{
    m2.lock();
    v2.use();
    m2.unlock();

    m1.lock();
    v1.use();
    m1.unlock();
}

Release operations can be reordered after an unrelated acquire operation (an unrelated operation being a later operation on a different object), so the execution could be transformed as follows:

mutex m1;
data  v1;

mutex m2;
data  v2;

void threadA()
{
    m1.lock();
    v1.use();

    m2.lock();
    m1.unlock();

    v2.use();
    m2.unlock();
}

void threadB()
{
    m2.lock();
    v2.use();

    m1.lock();
    m2.unlock();

    v1.use();
    m1.unlock();
}

So it looks like there is a potential deadlock.

Questions:

  1. How does the Standard prevent such mutexes from deadlocking?
  2. What is the best way to implement a spin-lock mutex that does not suffer from this issue?
  3. Is the unmodified mutex from the top of this post usable in some cases?

(Not a duplicate of "C++11 memory_order_acquire and memory_order_release semantics?", though it is in the same area.)

There's no problem in the ISO C++ standard; it doesn't distinguish compile-time vs. run-time reordering, and the code still has to execute as if it ran in source order on the C++ abstract machine. So the effects of m2.test_and_set(std::memory_order_acquire) trying to take the 2nd lock can become visible to other threads while the first lock is still held (i.e. before m1.unlock()), but failure there can't prevent m1 from ever being released.

The only way we'd have a problem is if compile-time reordering nailed that order down into the asm for some machine, such that the m2 lock retry loop had to exit before actually releasing m1.

Also, ISO C++ only defines ordering in terms of synchronizes-with and what can see what, not in terms of re-ordering operations relative to some new order. That would imply such an order existed. For separate objects, no such order that multiple threads can agree on is even guaranteed to exist, unless you use seq_cst operations. (A modification order for each object separately is guaranteed to exist, though.)

The 1-way-barrier model of acquire and release operations (like the diagram in https://preshing.com/20120913/acquire-and-release-semantics) is a convenient way to think about things, and matches reality for pure loads and pure stores on x86 and AArch64, for example. But as far as language-lawyering goes, it's not how the ISO C++ standard defines things.


You're reordering a whole retry loop, not just a single acquire

Reordering an atomic operation across a long-running loop is a theoretical problem allowed by the C++ standard. P0062R1: When should compilers optimize atomics? points out that delaying a store until after a long-running loop is technically allowed by the standard's wording of 1.10p28:

An implementation should ensure that the last value (in modification order) assigned by an atomic or synchronization operation will become visible to all other threads in a finite period of time.

But a potentially infinite delay would violate that (it is not finite in the deadlock case, for example), so compilers must not do that.

It's not "just" a quality-of-implementation issue. A successful mutex lock is an acquire operation, but you should not look at the retry loop as a single acquire operation. Any sane compiler won't.

(The classic example of something that aggressive atomics optimization could break is a progress bar: the compiler sinks all the relaxed stores out of a loop, then folds all the dead stores into one final store of 100%. See also this Q&A; current compilers don't do that, and basically treat atomic like volatile atomic, until C++ solves the problem of giving programmers a way to tell the compiler when atomics can/can't be optimized safely.)

Note: the technical posts on this site follow the CC BY-SA 4.0 license; if you repost, please credit this site or the original source.
