
Does a spurious wake up unblock all waiting threads, even the unrelated ones?

I'm still new to multi-threading in C++ and I'm currently trying to wrap my head around "spurious wake-ups" and what's causing them. I've done some digging on condition variables, kernel signals, futex, etc., and found several culprits on why and how "spurious wake-ups" occur, but there is still something that I can't find the answer to...

Question: Will a spurious wake-up unblock all waiting/blocked threads, even the ones waiting for a completely unrelated notification? Or are there separate waiting queues for the blocked threads and therefore the threads waiting for another notification are protected?

Example: Let's say that we have 249 Spartans waiting to attack the Persians. They wait() for their leader, Leonidas (the 250th) to notify_all() when to attack. Now, on the other side of the camp there are 49 injured Spartans who are waiting for the doctor (the 50th) to notify_one() so that he could treat each one. Would a spurious wake-up unblock all waiting Spartans, including the injured ones, or would it only affect the ones waiting for battle? Are there two separate queues for the waiting threads, or just one for all?

Apologies if the example is misleading... I didn't know how else to explain it.

Causes for spurious wakeups are specific to each operating system, and so are the properties of such wakeups. In Linux, for example, a spurious wakeup happens when a signal is delivered to a blocked thread. After executing the signal handler, the thread does not block again and instead receives a special error code (usually EINTR) from the system call that it was blocked on. Since signal handling does not involve other threads, they do not get woken up.

Note that a spurious wakeup does not depend on the synchronization primitive you're blocking on or on the number of threads blocked on that primitive. It may also happen with non-synchronization blocking system calls like read or write. In general, you have to assume that any blocking system call may return prematurely for whatever reason, unless a specification like POSIX guarantees that it will not (and even then, there may be bugs and OS specifics that deviate from the specification).
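As a minimal sketch of what "assume the call may return prematurely" looks like for a plain read, assuming a POSIX system: the helper below simply reissues the call whenever it fails with EINTR. The name read_retrying is made up for illustration and is not part of any library.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <unistd.h>  // POSIX read(); assumes a POSIX system such as Linux

// Hypothetical helper: retries read() when a signal interrupts it.
// Returns the byte count from read(), 0 at end of file, or -1 on a real error.
ssize_t read_retrying(int fd, void* buf, std::size_t count) {
    assert(buf != nullptr);
    for (;;) {
        ssize_t n = ::read(fd, buf, count);
        if (n >= 0) {
            return n;  // data was read, or end of file was reached
        }
        if (errno != EINTR) {
            return -1;  // a genuine error; errno is left for the caller
        }
        // errno == EINTR: a signal handler ran while this thread was blocked.
        // Only this thread was interrupted, so only this thread retries.
    }
}
```

Only the interrupted thread sees EINTR, which matches the point above: other threads blocked in their own calls are not affected.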

Some people attribute superfluous notifications to spurious wakeups because both are usually handled the same way. They are not the same, though. Unlike spurious wakeups, superfluous notifications are actually caused by another thread and are the result of a notify operation on the condition variable or futex. It is just that the condition you check upon wakeup may have turned false again before the unblocked thread manages to check it.

A spurious wakeup, in the context of a condition variable, exists only from the waiter's perspective. It means that the wait returned, but the condition is not true; thus the idiomatic use is:

Thing.lock()
while Thing.state != Play {
    Thing.wait()
}
....
Thing.unlock()

Every return from wait() in this loop, except the final one after which the condition holds, would be considered spurious. Why they occur:

  1. Many conditions are being multiplexed onto a single condition variable; sometimes this is appropriate, sometimes it is just lazy.
  2. Another waiting thread beat your thread to the condition and changed the state before you got a chance to own it.
  3. Unrelated events, such as kill(2) handling, cause the wait to return so that consistency can be ensured after asynchronous handlers have run.

The most important thing is to verify that the desired condition has been met, and to retry or abandon if not. Failing to recheck the condition is a common error, and one that can be very difficult to diagnose.
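To tie this back to the Spartans in the question, here is a rough C++ sketch, not a definitive implementation. It assumes each group waits on its own std::condition_variable, so notify_all() for the attack never targets the injured, and it uses the predicate overload of wait(), which rechecks the condition after every wakeup. All names (Camp, warrior, injured, run_camp, free_slots) are invented for the example.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Illustrative state: two condition variables, one per group of waiters.
struct Camp {
    std::mutex m;
    std::condition_variable attack_cv;  // the warriors wait here
    std::condition_variable doctor_cv;  // the injured wait here
    bool attack = false;
    int free_slots = 0;  // treatments the doctor has offered so far
    int attacked = 0;
    int treated = 0;
};

struct CampResult {
    int attacked;
    int treated_before_doctor;
    int treated;
};

void warrior(Camp& c) {
    std::unique_lock<std::mutex> lk(c.m);
    // The predicate is rechecked after every wakeup, spurious or not.
    c.attack_cv.wait(lk, [&c] { return c.attack; });
    ++c.attacked;
}

void injured(Camp& c) {
    std::unique_lock<std::mutex> lk(c.m);
    c.doctor_cv.wait(lk, [&c] { return c.free_slots > 0; });
    --c.free_slots;
    ++c.treated;
}

CampResult run_camp(int warriors, int patients) {
    assert(warriors >= 0 && patients >= 0);
    Camp c;
    std::vector<std::thread> army, ward;
    for (int i = 0; i < warriors; ++i) army.emplace_back(warrior, std::ref(c));
    for (int i = 0; i < patients; ++i) ward.emplace_back(injured, std::ref(c));

    {   // Leonidas gives the order.
        std::lock_guard<std::mutex> lk(c.m);
        c.attack = true;
    }
    c.attack_cv.notify_all();
    for (auto& t : army) t.join();

    CampResult r{};
    {   // Snapshot taken after the battle but before the doctor starts.
        std::lock_guard<std::mutex> lk(c.m);
        r.attacked = c.attacked;
        r.treated_before_doctor = c.treated;
    }
    for (int i = 0; i < patients; ++i) {  // the doctor, one patient at a time
        {
            std::lock_guard<std::mutex> lk(c.m);
            ++c.free_slots;
        }
        c.doctor_cv.notify_one();
    }
    for (auto& t : ward) t.join();
    r.treated = c.treated;
    return r;
}
```

Even if an injured Spartan were woken spuriously during the battle, the predicate would still be false (free_slots is 0), so he would go straight back to waiting. That is why treated_before_doctor stays at zero regardless of what the operating system does.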

A more serious example should illustrate this:

/* Lock, Unlock, Wait and Broadcast stand for the mutex and condition
   variable operations of the queue; they are not a specific library API. */
int q_next(Q *q, int idx) {
    /* return the q index succeeding this, with wrap */
    if (idx + 1 == q->len) {
        return 0;
    } else {
        return idx + 1;
    }
}
void q_get(Q *q, Item *p) {
    Lock(q);
    while (q->head == q->tail) {
        /* q is empty: wait, then recheck */
        Wait(q);
    }
    *p = q->data[q->tail];

    if (q_next(q, q->head) == q->tail) {
        /* q was full, now has space */
        Broadcast(q);
    }
    q->tail = q_next(q, q->tail);
    Unlock(q);
}
void q_put(Q *q, Item *p) {
    Lock(q);
    while (q_next(q, q->head) == q->tail) {
        /* q is full: wait, then recheck */
        Wait(q);
    }
    q->data[q->head] = *p;
    if (q->head == q->tail) {
        /* q was empty, data available */
        Broadcast(q);
    }
    q->head = q_next(q, q->head);
    Unlock(q);
}

This is a multi-reader, multi-writer queue. Writers wait until there is space in the queue, put the item in, and if the queue was previously empty, broadcast to indicate there is now data. Readers wait until there is something in the queue, take the item from the queue, and if the queue was previously full, broadcast to indicate there is now space.

Note that the condition variable is being used for two conditions {not full, not empty}. These are edge-triggered conditions: only the transitions out of full and out of empty are signaled.

q_get and q_put protect themselves from spurious wakeups caused by both causes 1 and 2 in the list above, and you can readily instrument the code to show how often this happens.
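One way such instrumentation might look, as a sketch in C++ rather than the pseudocode above: the same ring buffer with a single shared std::condition_variable, plus two counters. The class name CountingQueue, the counters, and the run_transfer driver are assumptions made for this example; a wakeup counts as wasted when the thread finds its condition still false after wait() returns.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Ring buffer with one condition variable shared by "not full" and "not empty".
class CountingQueue {
public:
    explicit CountingQueue(std::size_t slots) : data_(slots) {
        assert(slots >= 2);  // one slot always stays unused
    }

    void put(int item) {
        std::unique_lock<std::mutex> lk(m_);
        while (next(head_) == tail_) {  // full
            cv_.wait(lk);
            ++wakeups_;
            if (next(head_) == tail_) ++wasted_;  // woke up, still full
        }
        bool was_empty = (head_ == tail_);
        data_[head_] = item;
        head_ = next(head_);
        if (was_empty) cv_.notify_all();  // edge: empty -> not empty
    }

    int get() {
        std::unique_lock<std::mutex> lk(m_);
        while (head_ == tail_) {  // empty
            cv_.wait(lk);
            ++wakeups_;
            if (head_ == tail_) ++wasted_;  // woke up, still empty
        }
        bool was_full = (next(head_) == tail_);
        int item = data_[tail_];
        tail_ = next(tail_);
        if (was_full) cv_.notify_all();  // edge: full -> not full
        return item;
    }

    long wakeups() { std::lock_guard<std::mutex> lk(m_); return wakeups_; }
    long wasted() { std::lock_guard<std::mutex> lk(m_); return wasted_; }

private:
    std::size_t next(std::size_t i) const { return (i + 1 == data_.size()) ? 0 : i + 1; }

    std::mutex m_;
    std::condition_variable cv_;
    std::vector<int> data_;
    std::size_t head_ = 0, tail_ = 0;
    long wakeups_ = 0, wasted_ = 0;
};

// Hypothetical driver: `pairs` writers each put 0..per_thread-1, and `pairs`
// readers each take per_thread items. Returns the sum of everything read.
long long run_transfer(CountingQueue& q, int pairs, int per_thread) {
    std::vector<long long> sums(pairs, 0);
    std::vector<std::thread> threads;
    for (int p = 0; p < pairs; ++p) {
        threads.emplace_back([&q, per_thread] {
            for (int i = 0; i < per_thread; ++i) q.put(i);
        });
        threads.emplace_back([&q, &sums, p, per_thread] {
            for (int i = 0; i < per_thread; ++i) sums[p] += q.get();
        });
    }
    for (auto& t : threads) t.join();
    long long total = 0;
    for (long long s : sums) total += s;
    return total;
}
```

After a run, comparing wasted() with wakeups() gives the fraction of wakeups that found nothing to do. The numbers depend on scheduling and will differ from run to run, so no particular output should be expected.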


 