Will a spurious wake-up unblock all waiting threads, even the unrelated ones?

Question

I'm still new to multi-threading in C++ and I'm currently trying to wrap my head around "spurious wake-ups" and what's causing them. I've done some digging on condition variables, kernel signals, futex, etc., and found several culprits behind why and how "spurious wake-ups" occur, but there is still something that I can't find the answer to...

Question: Will a spurious wake-up unblock all waiting/blocked threads, even the ones waiting for a completely unrelated notification? Or are there separate waiting queues for the blocked threads, so that the threads waiting for another notification are protected?

Example: Let's say that we have 249 Spartans waiting to attack the Persians. They wait() for their leader, Leonidas (the 250th), to notify_all() when to attack. Now, on the other side of the camp there are 49 injured Spartans who are waiting for the doctor (the 50th) to notify_one() so that he could treat each one. Would a spurious wake-up unblock all waiting Spartans, including the injured ones, or would it only affect the ones waiting for battle? Are there two separate queues for the waiting threads, or just one for all?

Apologies if the example is misleading... I didn't know how else to explain it.

Answer 1

Causes for spurious wakeups are specific to each operating system, and so are the properties of such wakeups. In Linux, for example, a wakeup happens when a signal is delivered to a blocked thread. After executing the signal handler, the thread does not block again and instead receives a special error code (usually EINTR) from the system call that it was blocked on. Since signal handling does not involve other threads, they do not get woken up.

Note that a spurious wakeup does not depend on the synchronization primitive you're blocking on or the number of threads blocked on that primitive. It may also happen with non-synchronization blocking system calls like read or write. In general, you have to assume that any blocking system call may return prematurely for whatever reason, unless a specification like POSIX guarantees that it will not (and even then, there may be bugs and OS specifics that deviate from the specification).
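On Linux, one common way to cope with a blocking call that returns early with EINTR is simply to issue the call again. The following is a minimal sketch, assuming a POSIX system; read_retrying is a hypothetical helper name used for illustration, not a library function.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <unistd.h>   // POSIX read(); assumes a POSIX system such as Linux

// Hypothetical helper: call read() again whenever a signal interrupts it.
// Returns whatever read() returns once the call completes, or fails for a
// reason other than EINTR.
ssize_t read_retrying(int fd, void* buf, std::size_t count) {
    assert(buf != nullptr);
    for (;;) {
        ssize_t n = ::read(fd, buf, count);
        if (n >= 0 || errno != EINTR) {
            return n;   // data, end of file, or a real error
        }
        // Interrupted by a signal before any data arrived: try again.
    }
}
```

Only the interrupted thread goes through the retry; other threads blocked in their own calls are not involved.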

Some attribute superfluous notifications to spurious wakeups because both are usually dealt with in the same way. They are not the same, though. Unlike spurious wakeups, superfluous notifications are actually caused by another thread and are the result of a notify operation on the condition variable or futex. It's just that the condition you check upon wakeup could turn false before the unblocked thread manages to check it.

Answer 2

A spurious wakeup, in the context of a condition variable, exists only from the waiter's perspective. It means that the wait exited, but the condition is not true; thus the idiomatic use is:

Thing.lock()
while Thing.state != Play {
    Thing.wait()
}
....
Thing.unlock()
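In C++ the same idiom might look like the sketch below. Thing, State, wait_for_play, set_play and demo_wait_for_play are hypothetical names standing in for the pseudocode above; only std::mutex, std::condition_variable and std::thread come from the standard library.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the "Thing" in the pseudocode.
enum class State { Stop, Play };

struct Thing {
    std::mutex m;
    std::condition_variable cv;
    State state = State::Stop;
};

// Block until thing.state is Play. Every wakeup re-tests the condition,
// so a spurious or superfluous wakeup just goes back to waiting.
void wait_for_play(Thing& thing) {
    std::unique_lock<std::mutex> lock(thing.m);
    while (thing.state != State::Play) {
        thing.cv.wait(lock);   // may return without any notification
    }
    assert(thing.state == State::Play);   // holds here, with the lock held
}

// Change the state under the lock, then notify the waiters.
void set_play(Thing& thing) {
    {
        std::lock_guard<std::mutex> lock(thing.m);
        thing.state = State::Play;
    }
    thing.cv.notify_all();
}

// Small driver: one waiter, one notifier. Returns true once the waiter
// has observed Play.
bool demo_wait_for_play() {
    Thing thing;
    bool saw_play = false;
    std::thread waiter([&] {
        wait_for_play(thing);
        saw_play = true;
    });
    set_play(thing);
    waiter.join();
    return saw_play;
}
```

The predicate overload, thing.cv.wait(lock, [&] { return thing.state == State::Play; }), is defined by the standard to run the same loop, so either spelling works.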

Every iteration of this loop but one would be considered spurious. Why they occur:

1. Many conditions are being multiplexed onto a single condition variable; sometimes this is appropriate, sometimes it is just lazy.
2. Another waiting thread beat your thread to the condition, and changed its state before you got a chance to own it.
3. Unrelated events, such as kill(2) handling, do this to ensure consistency after asynchronous handlers have run.
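For the first cause listed above (many conditions multiplexed onto one condition variable), one alternative is a separate condition variable per condition. In C++, a notify on one std::condition_variable only unblocks threads waiting on that same object. The sketch below models the camp from the question under that assumption; Camp, fighter, injured, Result and run_camp are hypothetical names, and sharing a single mutex between the two groups is just one possible design.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical model of the camp: each group has its own condition variable.
struct Camp {
    std::mutex m;
    std::condition_variable attack_cv;   // fighters wait here
    std::condition_variable doctor_cv;   // injured wait here
    bool attack = false;                 // condition the fighters wait for
    int doctor_slots = 0;                // condition the injured wait for
    int charged = 0;
    int treated = 0;
};

void fighter(Camp& c) {
    std::unique_lock<std::mutex> lock(c.m);
    c.attack_cv.wait(lock, [&] { return c.attack; });
    ++c.charged;
}

void injured(Camp& c) {
    std::unique_lock<std::mutex> lock(c.m);
    c.doctor_cv.wait(lock, [&] { return c.doctor_slots > 0; });
    --c.doctor_slots;
    ++c.treated;
}

struct Result { int charged; int treated_before_doctor; int treated; };

Result run_camp(int fighters, int patients) {
    assert(fighters >= 0 && patients >= 0);
    Camp c;
    std::vector<std::thread> army, ward;
    for (int i = 0; i < fighters; ++i) army.emplace_back(fighter, std::ref(c));
    for (int i = 0; i < patients; ++i) ward.emplace_back(injured, std::ref(c));

    {   // Leonidas gives the order to everyone waiting on attack_cv.
        std::lock_guard<std::mutex> lock(c.m);
        c.attack = true;
    }
    c.attack_cv.notify_all();
    for (auto& t : army) t.join();

    int before = 0;   // patients treated before the doctor has done anything
    {
        std::lock_guard<std::mutex> lock(c.m);
        before = c.treated;
    }

    for (int i = 0; i < patients; ++i) {   // the doctor, one patient at a time
        {
            std::lock_guard<std::mutex> lock(c.m);
            ++c.doctor_slots;
        }
        c.doctor_cv.notify_one();
    }
    for (auto& t : ward) t.join();
    return {c.charged, before, c.treated};
}
```

Calling run_camp(249, 49) mirrors the numbers in the question. Because the injured re-check doctor_slots on every wakeup, even a stray wakeup on doctor_cv would send them straight back to waiting.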

The most important thing is to verify that the desired condition has been met, and to retry or abandon if not. Failing to recheck the condition is a common error, and one that can be very difficult to diagnose.

A more serious example should illustrate this:

/* Lock, Unlock, Wait and Broadcast stand in for the mutex and condition
   variable operations on q; Q and Item are left abstract. */

/* return the q index succeeding idx, with wrap */
int q_next(Q *q, int idx) {
    if (idx + 1 == q->len) {
        return 0;
    } else {
        return idx + 1;
    }
}

/* remove the item at the tail, blocking while the queue is empty */
void q_get(Q *q, Item *p) {
    Lock(q);
    while (q->head == q->tail) {
        Wait(q);
    }
    *p = q->data[q->tail];

    if (q_next(q, q->head) == q->tail) {
        /* q was full, now has space */
        Broadcast(q);
    }
    q->tail = q_next(q, q->tail);
    Unlock(q);
}

/* store an item at the head, blocking while the queue is full */
void q_put(Q *q, Item *p) {
    Lock(q);
    while (q_next(q, q->head) == q->tail) {
        Wait(q);
    }
    q->data[q->head] = *p;
    if (q->head == q->tail) {
        /* q was empty, data available */
        Broadcast(q);
    }
    q->head = q_next(q, q->head);
    Unlock(q);
}

This is a multi-reader, multi-writer queue. Writers wait until there is space in the queue, put the item in, and if the queue was previously empty, broadcast to indicate there is now data. Readers wait until there is something in the queue, take the item from the queue, and if the queue was previously full, broadcast to indicate there is now space.

Note that the condition variable is being used for two conditions {not full, not empty}. These are edge-triggered conditions: only the transitions from full and from empty are signaled.

q_get and q_put protect themselves from spurious wakeups caused by both [1] and [2], and you can readily instrument the code to show how often this happens.
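One way to do that instrumentation is sketched below in C++, with std::mutex and std::condition_variable swapped in for the abstract Lock/Wait/Broadcast. BoundedQueue, rewaits, Stats and run_queue_demo are hypothetical names, and the counter cannot tell a truly spurious wakeup from a superfluous notification; it only counts wakeups that found the condition still false.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Ring buffer with one mutex and one condition variable shared by the
// "not full" and "not empty" conditions, as in the queue above.
class BoundedQueue {
public:
    explicit BoundedQueue(std::size_t len) : data_(len) { assert(len >= 2); }

    int get() {
        std::unique_lock<std::mutex> lock(m_);
        while (head_ == tail_) {                    // empty
            cv_.wait(lock);
            if (head_ == tail_) ++rewaits_;         // woke up, still empty
        }
        int item = data_[tail_];
        bool was_full = next(head_) == tail_;
        tail_ = next(tail_);
        if (was_full) cv_.notify_all();             // space is now available
        return item;
    }

    void put(int item) {
        std::unique_lock<std::mutex> lock(m_);
        while (next(head_) == tail_) {              // full
            cv_.wait(lock);
            if (next(head_) == tail_) ++rewaits_;   // woke up, still full
        }
        data_[head_] = item;
        bool was_empty = head_ == tail_;
        head_ = next(head_);
        if (was_empty) cv_.notify_all();            // data is now available
    }

    long rewaits() {
        std::lock_guard<std::mutex> lock(m_);
        return rewaits_;
    }

private:
    std::size_t next(std::size_t i) const {
        return i + 1 == data_.size() ? 0 : i + 1;
    }

    std::mutex m_;
    std::condition_variable cv_;
    std::vector<int> data_;
    std::size_t head_ = 0;   // next slot to write
    std::size_t tail_ = 0;   // next slot to read
    long rewaits_ = 0;       // wakeups that had to wait again
};

struct Stats { long long sum; long rewaits; };

// Start `workers` writers and `workers` readers; each writer puts
// 1..per_worker, each reader takes per_worker items and adds them up.
Stats run_queue_demo(int workers, int per_worker) {
    BoundedQueue q(4);
    std::atomic<long long> sum{0};
    std::vector<std::thread> threads;
    for (int w = 0; w < workers; ++w) {
        threads.emplace_back([&] { for (int i = 1; i <= per_worker; ++i) q.put(i); });
        threads.emplace_back([&] { for (int i = 0; i < per_worker; ++i) sum += q.get(); });
    }
    for (auto& t : threads) t.join();
    return {sum.load(), q.rewaits()};
}
```

The sum returned by run_queue_demo is fixed by its arguments, while the rewaits count depends on scheduling and will differ from run to run.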


Answer 1: 2, accepted, 2020-02-09 11:40:34

Answer 2: 2, 2020-02-09 15:55:17
