How to use an eventfd with level triggered behaviour on epoll?

Registering a level-triggered eventfd with epoll_ctl only fires once when the eventfd counter is not decremented. To summarize the problem: I have observed that the epoll flags (EPOLLET, EPOLLONESHOT, or none for level-triggered behaviour) all behave the same. In other words, they have no effect.

Could you confirm this bug?

I have an application with multiple threads. Each thread waits for new events with epoll_wait on the same epollfd. To terminate the application gracefully, all threads have to be woken up. My idea was to use the eventfd counter (EFD_SEMAPHORE|EFD_NONBLOCK) with level-triggered epoll behaviour to wake them all up together. (Leaving aside the thundering herd problem, given the small number of file descriptors.)

E.g. for 4 threads you write 4 to the eventfd. I was expecting epoll_wait to return immediately, again and again, until the counter has been decremented (read) 4 times. Instead, epoll_wait only returns once for every write.

Yep, I read all related manuals carefully ;)

#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>
#include <pthread.h>

static int event_fd = -1;
static int epoll_fd = -1;

void *thread(void *arg)
{
    (void) arg;

    for(;;) {
       struct epoll_event event;
       /* retry on error (e.g. EINTR) so `event` is never read uninitialized */
       if (epoll_wait(epoll_fd, &event, 1, -1) < 1)
           continue;

       /* handle events */
       if(event.data.fd == event_fd && event.events & EPOLLIN) {
           uint64_t val = 0;
           eventfd_read(event_fd, &val);
           break;
       }
    }

    return NULL;
}

int main(void)
{
    epoll_fd = epoll_create1(0);
    event_fd = eventfd(0, EFD_SEMAPHORE| EFD_NONBLOCK);

    struct epoll_event event;
    event.events = EPOLLIN;
    event.data.fd = event_fd;
    epoll_ctl(epoll_fd, EPOLL_CTL_ADD, event_fd, &event);

    enum { THREADS = 4 };
    pthread_t thrd[THREADS];

    for (int i = 0; i < THREADS; i++)
        pthread_create(&thrd[i], NULL, &thread, NULL);

    /* let threads park internally (kernel does readiness check before sleeping) */
    usleep(100000);
    eventfd_write(event_fd, THREADS);

    for (int i = 0; i < THREADS; i++)
        pthread_join(thrd[i], NULL);
}

When you write to an eventfd, the function eventfd_signal is called. It contains the following line, which does the wake-up:

wake_up_locked_poll(&ctx->wqh, EPOLLIN);

With wake_up_locked_poll being a macro:

#define wake_up_locked_poll(x, m)                       \
    __wake_up_locked_key((x), TASK_NORMAL, poll_to_key(m))

With __wake_up_locked_key being defined as:

void __wake_up_locked_key(struct wait_queue_head *wq_head, unsigned int mode, void *key)
{
    __wake_up_common(wq_head, mode, 1, 0, key, NULL);
}

And finally, __wake_up_common is declared as:

/*
 * The core wakeup function. Non-exclusive wakeups (nr_exclusive == 0) just
 * wake everything up. If it's an exclusive wakeup (nr_exclusive == small +ve
 * number) then we wake all the non-exclusive tasks and one exclusive task.
 *
 * There are circumstances in which we can try to wake a task which has already
 * started to run but is not in state TASK_RUNNING. try_to_wake_up() returns
 * zero in this (rare) case, and we handle it by continuing to scan the queue.
 */
static int __wake_up_common(struct wait_queue_head *wq_head, unsigned int mode,
            int nr_exclusive, int wake_flags, void *key,
            wait_queue_entry_t *bookmark)

Note the nr_exclusive argument: __wake_up_locked_key passes 1 for it, so writing to an eventfd wakes only one exclusive waiter.

What does exclusive mean? Reading the epoll_ctl man page gives us some insight:

EPOLLEXCLUSIVE (since Linux 4.5):

Sets an exclusive wakeup mode for the epoll file descriptor that is being attached to the target file descriptor, fd. When a wakeup event occurs and multiple epoll file descriptors are attached to the same target file using EPOLLEXCLUSIVE, one or more of the epoll file descriptors will receive an event with epoll_wait(2).

You do not use EPOLLEXCLUSIVE when adding your event, but to wait with epoll_wait every thread has to put itself onto a wait queue. The function do_epoll_wait performs the wait by calling ep_poll. Following the code, you can see that it adds the current thread to a wait queue at line #1903:

__add_wait_queue_exclusive(&ep->wq, &wait);

This explains what is going on: epoll waiters are exclusive, so only a single thread is woken up. This behaviour was introduced in v2.6.22-rc1, and the relevant change has been discussed here.

To me this looks like a bug in the eventfd_signal function: in semaphore mode it should perform a wake-up with nr_exclusive equal to the value written.

So your options are:

  • Create a separate epoll descriptor for each thread (might not work with your design: scaling problems)
  • Put a mutex around it (scaling problems)
  • Use poll, probably on both the eventfd and the epoll descriptor
  • Wake each thread separately by writing 1 with eventfd_write 4 times (probably the best you can do).
