How does the C++ Standard prevent deadlock in a spinlock mutex with memory_order_acquire and memory_order_release?
TL;DR: if a mutex implementation uses acquire and release operations, could an implementation do compile-time reordering of the kind that is normally allowed, and overlap two critical sections that should be independent because they come from different locks? This would lead to a potential deadlock.
Assume a mutex is implemented on top of std::atomic_flag:
struct mutex
{
    void lock()
    {
        while (flag.test_and_set(std::memory_order_acquire))
        {
            yield_execution(); // e.g. std::this_thread::yield()
        }
    }
    void unlock()
    {
        flag.clear(std::memory_order_release);
    }
    std::atomic_flag flag; // = ATOMIC_FLAG_INIT in pre-C++20
};
So far this looks OK as far as using a single such mutex goes: std::memory_order_release synchronizes with std::memory_order_acquire.
The use of std::memory_order_acquire / std::memory_order_release here should not raise questions at first sight. It is similar to the cppreference example: https://en.cppreference.com/w/cpp/atomic/atomic_flag
Now there are two mutexes guarding different variables, and two threads accessing them in different order:
mutex m1;
data v1;
mutex m2;
data v2;

void threadA()
{
    m1.lock();
    v1.use();
    m1.unlock();
    m2.lock();
    v2.use();
    m2.unlock();
}

void threadB()
{
    m2.lock();
    v2.use();
    m2.unlock();
    m1.lock();
    v1.use();
    m1.unlock();
}
Release operations can be reordered after an unrelated acquire operation (unrelated operation == a later operation on a different object), so the execution could be transformed as follows:
mutex m1;
data v1;
mutex m2;
data v2;

void threadA()
{
    m1.lock();
    v1.use();
    m2.lock();
    m1.unlock();
    v2.use();
    m2.unlock();
}

void threadB()
{
    m2.lock();
    v2.use();
    m1.lock();
    m2.unlock();
    v1.use();
    m1.unlock();
}
So it looks like there is a potential deadlock.

Questions:

(This is not a duplicate of "C++11 memory_order_acquire and memory_order_release semantics?", though it is in the same area.)
There's no problem in the ISO C++ standard; it doesn't distinguish compile-time vs. run-time reordering, and the code still has to execute as if it ran in source order on the C++ abstract machine. So the effects of m2.test_and_set(std::memory_order_acquire) trying to take the 2nd lock can become visible to other threads while still holding the first (i.e. before m1 is released), but failure there can't prevent m1 from ever being released.
The only way we'd have a problem is if compile-time reordering nailed down that order into asm for some machine, such that the m2 lock retry loop had to exit before actually releasing m1.
Also, ISO C++ only defines ordering in terms of synchronizes-with and what can see what, not in terms of re-ordering operations relative to some new order. That would imply some such order existed. No such order that multiple threads can agree on is even guaranteed to exist for separate objects, unless you use seq_cst operations. (A modification order for each object separately is guaranteed to exist, though.)
The 1-way-barrier model of acquire and release operations (like the diagram in https://preshing.com/20120913/acquire-and-release-semantics) is a convenient way to think about things, and matches reality for pure loads and pure stores on x86 and AArch64, for example. But as far as language-lawyering goes, it's not how the ISO C++ standard defines things.
Reordering an atomic operation across a long-running loop is a theoretical problem allowed by the C++ standard. P0062R1: When should compilers optimize atomics? points out that delaying a store until after a long-running loop is technically allowed by the standard's wording of 1.10p28:

An implementation should ensure that the last value (in modification order) assigned by an atomic or synchronization operation will become visible to all other threads in a finite period of time.

But a potentially infinite loop would violate that, not being finite in the deadlock case for example, so compilers must not do that.
It's not "just" a quality-of-implementation issue. A successful mutex lock is an acquire operation, but you should not look at the retry loop as a single acquire operation. Any sane compiler won't.
(The classic example of something that aggressive atomics optimization could break is a progress bar, where the compiler sinks all the relaxed stores out of a loop and then folds all the dead stores into one final store of 100%. See also this Q&A - current compilers don't, and basically treat atomic as volatile atomic until C++ solves the problem of giving programmers a way to let the compiler know when atomics can/can't be optimized safely.)