How to correctly read data when using epoll_wait

Question

I am trying to port existing Windows C++ code that uses IOCP to Linux. Having decided to use epoll_wait to achieve high concurrency, I already face a theoretical issue concerning how received data gets processed.

Imagine two threads calling epoll_wait, and two consecutive messages being received such that Linux unblocks the first thread and, soon after, the second.

Example:

Thread 1 blocks on epoll_wait
Thread 2 blocks on epoll_wait
Client sends a chunk of data 1
Thread 1 unblocks from epoll_wait, performs recv and tries to process the data
Client sends a chunk of data 2
Thread 2 unblocks, performs recv and tries to process the data

Is this scenario conceivable? That is, can it occur?

Is there a way to prevent it, so as to avoid implementing synchronization in the recv/processing code?

Answer 1

If you have multiple threads reading from the same set of epoll handles, I would recommend you put your epoll handles in one-shot level-triggered mode with EPOLLONESHOT. This will ensure that, after one thread observes the triggered handle, no other thread will observe it until you use epoll_ctl to re-arm the handle.

If you need to handle read and write paths independently, you may want to completely split up the read and write thread pools: have one epoll handle for read events and one for write events, and assign threads to one or the other exclusively. Further, have a separate lock for the read path and for the write path. You must be careful about interactions between the read and write threads as far as modifying any per-socket state, of course.

If you do go with that split approach, you need to put some thought into how to handle socket closures. Most likely you will want an additional shared-data lock, and 'acknowledge closure' flags, set under the shared-data lock, for both read and write paths. Read and write threads can then race to acknowledge, and the last one to acknowledge gets to clean up the shared data structures. That is, something like this:

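// Called once from the read path and once from the write path when the
// socket closes. LOCK/UNLOCK stand for whatever mutex primitives you use.
void OnSocketClosed(shareddatastructure *pShared, int writer)
{
  // Stop monitoring the socket on the caller's epoll handle.
  epoll_ctl(myepollhandle, EPOLL_CTL_DEL, pShared->fd, NULL);
  LOCK(pShared->common_lock);
  if (writer)
    pShared->close_ack_w = true;
  else
    pShared->close_ack_r = true;

  // Sample both flags while still holding the lock.
  bool acked = pShared->close_ack_w && pShared->close_ack_r;
  UNLOCK(pShared->common_lock);

  // Only the last path to acknowledge frees the shared state.
  if (acked)
    free(pShared);
}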

Answer 2

I'm assuming here that the situation you're trying to process is something like this:

You have multiple (maybe very many) sockets that you want to receive data from at once;

You want to start processing data from the first connection on Thread A when it is first received and then be sure that data from this connection is not processed on any other thread until you have finished with it in Thread A.

While you are doing that, if some data is now received on a different connection you want Thread B to pick up that data and process it, while still being sure that no one else can process this connection until Thread B is done with it, and so on.

Under these circumstances it turns out that using epoll_wait() with the same epoll fd in multiple threads is a reasonably efficient approach (I'm not claiming that it is necessarily the most efficient).

The trick here is to add the individual connection fds to the epoll fd with the EPOLLONESHOT flag. This ensures that once an fd has been returned from an epoll_wait() it is unmonitored until you specifically tell epoll to monitor it again. As a result, the thread processing this connection suffers no interference, because no other thread can be processing the same connection until this thread marks the connection to be monitored again.

You can set up the fd to monitor EPOLLIN or EPOLLOUT again using epoll_ctl() and EPOLL_CTL_MOD.

A significant benefit of using epoll like this in multiple threads is that when one thread is finished with a connection and adds it back to the epoll monitored set, any other threads still in epoll_wait() are immediately monitoring it, even before the previous processing thread returns to epoll_wait(). Incidentally, that could also be a disadvantage because of lack of cache data locality if a different thread now picks up that connection immediately (thus needing to fetch the data structures for this connection and flush the previous thread's cache). What works best depends heavily on your exact usage pattern.

If you are trying to process messages received subsequently on the same connection in different threads, then this scheme for using epoll is not going to be appropriate for you, and an approach using a listening thread feeding an efficient queue feeding worker threads might be better.
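For the queue-based alternative, a bounded blocking queue built on a mutex and two condition variables is one possible starting point. This is only a sketch: struct work_queue, QUEUE_CAP and the function names are invented for the example, and the items are plain ints (for instance message handles) to keep it short. Nothing here claims to be the "efficient queue" the answer has in mind.

```c
#include <pthread.h>
#include <stddef.h>

#define QUEUE_CAP 64  /* arbitrary capacity chosen for the sketch */

/* Hypothetical bounded FIFO shared by one listener and many workers. */
struct work_queue {
    int items[QUEUE_CAP];
    int head;   /* index of the oldest item */
    int count;  /* number of items currently stored */
    pthread_mutex_t lock;
    pthread_cond_t not_empty;
    pthread_cond_t not_full;
};

void queue_init(struct work_queue *q)
{
    q->head = 0;
    q->count = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
    pthread_cond_init(&q->not_full, NULL);
}

/* Called by the listening thread; blocks while the queue is full. */
void queue_push(struct work_queue *q, int item)
{
    pthread_mutex_lock(&q->lock);
    while (q->count == QUEUE_CAP)
        pthread_cond_wait(&q->not_full, &q->lock);
    q->items[(q->head + q->count) % QUEUE_CAP] = item;
    q->count++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

/* Called by worker threads; blocks while the queue is empty. */
int queue_pop(struct work_queue *q)
{
    pthread_mutex_lock(&q->lock);
    while (q->count == 0)
        pthread_cond_wait(&q->not_empty, &q->lock);
    int item = q->items[q->head];
    q->head = (q->head + 1) % QUEUE_CAP;
    q->count--;
    pthread_cond_signal(&q->not_full);
    pthread_mutex_unlock(&q->lock);
    return item;
}
```

With this layout only the listening thread ever calls epoll_wait() and recv(), so the socket itself needs no locking; the workers contend only on the queue.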

Answer 3

Previous answers that point out that calling epoll_wait() from multiple threads is a bad idea are almost certainly right, but I was intrigued enough by the question to try to work out what does happen when it is called from multiple threads on the same handle, waiting for the same socket. I wrote the following test code:

#include <netinet/in.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

struct thread_info {
  int number;
  int socket;
  int epoll;
};

void * thread(struct thread_info * arg)
{
    struct epoll_event events[10];
    int s;
    char buf[512];

    sleep(5 * arg->number);
    printf("Thread %d start\n", arg->number);

    do {
        s = epoll_wait(arg->epoll, events, 10, -1);

        if (s < 0) {
            perror("wait");
            exit(1);
        } else if (s == 0) {
            printf("Thread %d No data\n", arg->number);
            exit(1);
        }
        if (recv(arg->socket, buf, 512, 0) <= 0) {
            perror("recv");
            exit(1);
        }
        printf("Thread %d got data\n", arg->number);
    } while (s == 1);

    printf("Thread %d end\n", arg->number);

    return 0;
}

int main()
{
    pthread_attr_t attr;
    pthread_t threads[2];
    struct thread_info thread_data[2];
    int s;
    int listener, client, epollfd;
    struct sockaddr_in listen_address;
    struct sockaddr_storage client_address;
    socklen_t client_address_len;
    struct epoll_event ev;

    listener = socket(AF_INET, SOCK_STREAM, 0);

    if (listener < 0) {
        perror("socket");
        exit(1);
    }

    memset(&listen_address, 0, sizeof(struct sockaddr_in));
    listen_address.sin_family = AF_INET;
    listen_address.sin_addr.s_addr = INADDR_ANY;
    listen_address.sin_port = htons(6799);

    s = bind(listener,
             (struct sockaddr*)&listen_address,
             sizeof(listen_address));

    if (s != 0) {
        perror("bind");
        exit(1);
    }

    s = listen(listener, 1);

    if (s != 0) {
        perror("listen");
        exit(1);
    }

    client_address_len = sizeof(client_address);
    client = accept(listener,
                    (struct sockaddr*)&client_address,
                    &client_address_len);

    if (client < 0) {
        perror("accept");
        exit(1);
    }

    epollfd = epoll_create(10);
    if (epollfd == -1) {
        perror("epoll_create");
        exit(1);
    }

    ev.events = EPOLLIN;
    ev.data.fd = client;
    if (epoll_ctl(epollfd, EPOLL_CTL_ADD, client, &ev) == -1) {
        perror("epoll_ctl: client");
        exit(1);
    }

    thread_data[0].number = 0;
    thread_data[1].number = 1;
    thread_data[0].socket = client;
    thread_data[1].socket = client;
    thread_data[0].epoll = epollfd;
    thread_data[1].epoll = epollfd;

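    /* pthread functions return the error number instead of setting errno,
       so report it with strerror() rather than perror(). */
    s = pthread_attr_init(&attr);
    if (s != 0) {
        fprintf(stderr, "pthread_attr_init: %s\n", strerror(s));
        exit(1);
    }

    s = pthread_create(&threads[0],
                       &attr,
                       (void*(*)(void*))&thread,
                       &thread_data[0]);

    if (s != 0) {
        fprintf(stderr, "pthread_create: %s\n", strerror(s));
        exit(1);
    }

    s = pthread_create(&threads[1],
                       &attr,
                       (void*(*)(void*))&thread,
                       &thread_data[1]);

    if (s != 0) {
        fprintf(stderr, "pthread_create: %s\n", strerror(s));
        exit(1);
    }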

    pthread_join(threads[0], 0);
    pthread_join(threads[1], 0);

    return 0;
}

When data arrives, and both threads are waiting on epoll_wait(), only one will return, but as subsequent data arrives, the thread that wakes up to handle the data is effectively random between the two threads. I wasn't able to find a way to affect which thread was woken.

It seems likely that a single thread calling epoll_wait makes most sense, with events passed to worker threads to pump the IO.

Answer 4

I believe that high-performance software that uses epoll with a thread per core creates multiple epoll handles, each handling a subset of all the connections. In this way the work is divided but the problem you describe is avoided.

Answer 5

Generally, epoll is used when you have a single thread listening for data on a single asynchronous source. To avoid busy-waiting (manually polling), you use epoll to let you know when data is ready (much like select does).

It is not standard practice to have multiple threads reading from a single data source, and I, at least, would consider it bad practice.

If you want to use multiple threads, but you only have one input source, then designate one of the threads to listen and queue the data so the other threads can read individual pieces from the queue.

5 answers

Answer 1: score 5, accepted, 2011-04-04 17:31:58
Answer 2: score 3, 2011-04-04 17:32:46
Answer 3: score 2, 2011-04-04 20:51:46
Answer 4: score 1, 2011-04-04 16:50:54
Answer 5: score 0, 2011-04-04 16:47:51
