
sem_wait() failed to wake up on linux

I have a real-time application that uses a shared FIFO. There are several writer processes and one reader process. Data is periodically written into the FIFO and constantly drained. Theoretically the FIFO should never overflow because the reading speed is faster than all writers combined. However, the FIFO does overflow.

I tried to reproduce the problem and finally worked out the following (simplified) code:

#include <stdint.h>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <cassert>
#include <pthread.h>
#include <semaphore.h>
#include <sys/time.h>
#include <unistd.h>


class Fifo
{
public:
    Fifo() : _deq(0), _wptr(0), _rptr(0), _lock(0)
    {
        memset(_data, 0, sizeof(_data));
        sem_init(&_data_avail, 1, 0);
    }

    ~Fifo()
    {
        sem_destroy(&_data_avail);
    }

    void Enqueue()
    {
        struct timeval tv;
        gettimeofday(&tv, NULL);
        uint64_t enq = tv.tv_usec + tv.tv_sec * 1000000;
        while (__sync_lock_test_and_set(&_lock, 1))
            sched_yield();
        uint8_t wptr = _wptr;
        uint8_t next_wptr = (wptr + 1) % c_entries;
        int retry = 0;
        while (next_wptr == _rptr)      // will become full
        {
            printf("retry=%u enq=%lu deq=%lu count=%d\n", retry, enq, _deq, Count());
            for (uint8_t i = _rptr; i != _wptr; i = (i+1)%c_entries)
                printf("%u: %lu\n", i, _data[i]);
            assert(retry++ < 2);
            usleep(500);
        }
        assert(__sync_bool_compare_and_swap(&_wptr, wptr, next_wptr));
        _data[wptr] = enq;
        __sync_lock_release(&_lock);
        sem_post(&_data_avail);
    }

    int Dequeue()
    {
        struct timeval tv;
        gettimeofday(&tv, NULL);
        uint64_t deq = tv.tv_usec + tv.tv_sec * 1000000;
        _deq = deq;
        uint8_t rptr = _rptr, wptr = _wptr;
        uint8_t next_rptr = (rptr + 1) % c_entries;
        bool empty = Count() == 0;
        assert(!sem_wait(&_data_avail));// bug in sem_wait?
        _deq = 0;
        uint64_t enq = _data[rptr];     // enqueue time
        assert(__sync_bool_compare_and_swap(&_rptr, rptr, next_rptr));
        int latency = deq - enq;        // latency from enqueue to dequeue
        if (empty && latency < -500)
        {
            printf("before dequeue: w=%u r=%u; after dequeue: w=%u r=%u; %d\n", wptr, rptr, _wptr, _rptr, latency);
        }
        return latency;
    }

    int Count()
    {
        int count = 0;
        assert(!sem_getvalue(&_data_avail, &count));
        return count;
    }

    static const unsigned c_entries = 16;

private:
    sem_t _data_avail;
    uint64_t _data[c_entries];
    volatile uint64_t _deq;     // non-0 indicates when dequeue happened
    volatile uint8_t _wptr, _rptr;      // write, read pointers
    volatile uint8_t _lock;     // write lock
};


static const unsigned c_total = 10000000;
static const unsigned c_writers = 3;

static Fifo s_fifo;


// writer thread
void* Writer(void* arg)
{
    for (unsigned i = 0; i < c_total; i++)
    {
        int t = rand() % 200 + 200;     // [200, 399]
        usleep(t);
        s_fifo.Enqueue();
    }
    return NULL;
}

int main()
{
    pthread_t thread[c_writers];
    for (unsigned i = 0; i < c_writers; i++)
        pthread_create(&thread[i], NULL, Writer, NULL);

    for (unsigned total = 0; total < c_total*c_writers; total++)
        s_fifo.Dequeue();
}

When Enqueue() overflows, the debug print indicates that Dequeue() is stuck (because _deq is not 0). The only place where Dequeue() can get stuck is sem_wait(). However, since the FIFO is full (also confirmed by sem_getvalue()), I don't understand how that could happen. Even after several retries (each waiting 500us) the FIFO was still full, even though Dequeue() should definitely be draining while Enqueue() is completely stopped (busy retrying).

In the code example, there are 3 writers, each writing every 200-400us. On my computer (8-core i7-2860 running CentOS 6.5, kernel 2.6.32-279.22.1.el6.x86_64, g++ 4.4.7 20120313), the code fails within a few minutes. I also tried it on several other CentOS systems, and it failed in the same way.

I know that making the FIFO bigger reduces the probability of overflow (in fact, the program still fails with c_entries=128), but in my real-time application there is a hard constraint on enqueue-dequeue latency, so the data must be drained quickly. If it's not a bug in sem_wait(), then what prevents it from getting the semaphore?

PS If I replace

        assert(!sem_wait(&_data_avail));// bug in sem_wait?

with

        while (sem_trywait(&_data_avail) < 0) sched_yield();

then the program runs fine. So it seems that there's something wrong in sem_wait() and/or the scheduler.

You need to use a combination of sem_wait/sem_post calls to manage your reader and writer threads.

Your enqueue thread only calls sem_post and your dequeue thread only calls sem_wait; you need to add a sem_wait to the enqueue thread and a sem_post to the dequeue thread.
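For illustration, here is a minimal sketch of that arrangement applied to a FIFO like the one above (my own example, not code from the question or the answer; BoundedFifo, _free_slots and _wlock are made-up names): a second counting semaphore tracks free slots, so a writer blocks in sem_wait() instead of overflowing, and the reader returns a slot after each dequeue.

#include <cstdint>
#include <cstring>
#include <pthread.h>
#include <semaphore.h>

class BoundedFifo
{
public:
    static const unsigned c_entries = 16;

    BoundedFifo() : _wptr(0), _rptr(0)
    {
        memset(_data, 0, sizeof(_data));
        sem_init(&_data_avail, 0, 0);           // number of filled entries
        sem_init(&_free_slots, 0, c_entries);   // number of empty entries
        pthread_mutex_init(&_wlock, NULL);
    }

    ~BoundedFifo()
    {
        sem_destroy(&_data_avail);
        sem_destroy(&_free_slots);
        pthread_mutex_destroy(&_wlock);
    }

    void Enqueue(uint64_t v)
    {
        sem_wait(&_free_slots);                 // blocks instead of overflowing
        pthread_mutex_lock(&_wlock);            // serialize the writers
        _data[_wptr] = v;
        _wptr = (_wptr + 1) % c_entries;
        pthread_mutex_unlock(&_wlock);
        sem_post(&_data_avail);                 // tell the reader data is ready
    }

    uint64_t Dequeue()                          // single reader, so no lock needed here
    {
        sem_wait(&_data_avail);
        uint64_t v = _data[_rptr];
        _rptr = (_rptr + 1) % c_entries;
        sem_post(&_free_slots);                 // give the slot back to the writers
        return v;
    }

private:
    sem_t _data_avail, _free_slots;
    pthread_mutex_t _wlock;
    uint64_t _data[c_entries];
    unsigned _wptr, _rptr;
};

With two counting semaphores like this, the overflow check and the usleep() retry loop in the original Enqueue() are no longer needed, because sem_wait(&_free_slots) already blocks until the reader has made room.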

A long time ago, I implemented a setup in which multiple threads/processes could read some shared memory while only one thread/process wrote to it. I used two semaphores, a write semaphore and a read semaphore. The reader threads would wait until the write semaphore was not set and then set the read semaphore; the writer threads would set the write semaphore and then wait until the read semaphore was not set. The readers and writers would then clear the semaphores they had set once they finished their work. The read semaphore could be held by n threads at a time, while the write semaphore could be held by only one thread at a time.
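For reference, one common way to code that kind of reader/writer scheme with POSIX semaphores looks roughly like this (a textbook-style sketch under my own naming, not the original implementation): one semaphore serializes updates to a reader count, and the first/last reader acquires/releases the write semaphore.

#include <semaphore.h>

// Many readers may hold the read side at once; a writer excludes everyone.
struct RwSem
{
    sem_t mutex;        // protects reader_count
    sem_t write_sem;    // held while a writer (or the group of readers) is active
    int   reader_count;

    RwSem() : reader_count(0)
    {
        sem_init(&mutex, 0, 1);
        sem_init(&write_sem, 0, 1);
    }

    void ReadLock()
    {
        sem_wait(&mutex);
        if (++reader_count == 1)    // first reader blocks writers
            sem_wait(&write_sem);
        sem_post(&mutex);
    }

    void ReadUnlock()
    {
        sem_wait(&mutex);
        if (--reader_count == 0)    // last reader lets writers in again
            sem_post(&write_sem);
        sem_post(&mutex);
    }

    void WriteLock()   { sem_wait(&write_sem); }
    void WriteUnlock() { sem_post(&write_sem); }
};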

If it's not a bug in sem_wait(), then what prevents it from getting the semaphore?

Your program's impatience prevents it. There is no guarantee that the Dequeue() thread is scheduled within a given number of retries. If you change

            assert(retry++ < 2);

to

            retry++;

you'll see that the program happily continues, sometimes only after 8 or perhaps even more retries of the reader process.

Why does Enqueue have to retry?

It has to retry simply because the main thread's Dequeue() hasn't been scheduled by then.

Dequeue speed is much faster than all writers combined.

Your program shows that this assumption is sometimes false. While the execution time of Dequeue() is apparently much shorter than that of the writers (due to the usleep(t)), this does not imply that Dequeue() is scheduled by the Completely Fair Scheduler more often - and the main reason for this is that you used a nondeterministic scheduling policy. From man sched_yield:

sched_yield() is intended for use with real-time scheduling policies
       (i.e., SCHED_FIFO or SCHED_RR).  Use of sched_yield() with
       nondeterministic scheduling policies such as SCHED_OTHER is
       unspecified and very likely means your application design is broken.

If you insert

    struct sched_param param = { .sched_priority = 1 };
    if (sched_setscheduler(0, SCHED_FIFO, &param) < 0)
        perror("sched_setscheduler");

at the start of main(), you'll likely see that your program performs as expected (when run with the appropriate privilege).
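(A side note of mine, not part of the original answer: switching to SCHED_FIFO requires either root or the CAP_SYS_NICE capability, so you would typically run the test under sudo or grant the capability to the binary, e.g. with setcap cap_sys_nice+ep, before the sched_setscheduler() call can succeed.)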
