
C: epoll and multithreading

I need to create a specialized HTTP server, and for this I plan to use the epoll syscall, but I want to utilize multiple processors/cores and I can't come up with an architectural solution. ATM my idea is the following: create multiple threads, each with its own epoll descriptor; the main thread accepts connections and distributes them among the threads' epoll instances. But are there any better solutions? Which books/articles/guides can I read on high-load architectures? I've only seen the C10K article, but most of the links to examples are dead :( and there are still no in-depth books on this subject :(.

Thank you for your answers.

UPD: Please be more specific; I need materials and examples (nginx is not an example because it's too complex and has multiple abstraction layers to support multiple systems).

..my idea is the following: create multiple threads, each with its own epoll descriptor; the main thread accepts connections and distributes them among the threads' epoll instances.

Yes, that's currently the best way to do this, and it's how Nginx does it. The number of threads can be increased or decreased depending on load and/or the number of physical cores on the machine.
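To make the architecture concrete, here is a minimal sketch in C of that model: one acceptor thread hands new connections round-robin to per-worker epoll instances (epoll_ctl may be called from a different thread than the one blocked in epoll_wait on the same instance). The port, worker count, and echo-style handler are placeholders of my own; a real HTTP server would parse requests, use non-blocking sockets, and handle errors properly.

/* Minimal sketch (not production code): one acceptor thread distributes
 * new connections round-robin across per-worker epoll instances. */
#include <netinet/in.h>
#include <pthread.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

#define NWORKERS 4

static int worker_epfd[NWORKERS];

/* Each worker waits on its own epoll instance and handles ready sockets. */
static void *worker_main(void *arg)
{
    int epfd = *(int *)arg;
    struct epoll_event events[64];
    char buf[4096];

    for (;;) {
        int n = epoll_wait(epfd, events, 64, -1);
        for (int i = 0; i < n; i++) {
            int fd = events[i].data.fd;
            ssize_t r = read(fd, buf, sizeof buf);
            if (r <= 0) {
                close(fd);                 /* closing also removes it from epoll */
            } else {
                /* ... parse the request, write the response ... */
                write(fd, buf, (size_t)r); /* placeholder: echo */
            }
        }
    }
    return NULL;
}

int main(void)
{
    int listen_fd = socket(AF_INET, SOCK_STREAM, 0);
    int one = 1;
    setsockopt(listen_fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof one);

    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(8080);           /* placeholder port */
    bind(listen_fd, (struct sockaddr *)&addr, sizeof addr);
    listen(listen_fd, SOMAXCONN);

    /* One epoll instance and one thread per worker. */
    pthread_t tid[NWORKERS];
    for (int i = 0; i < NWORKERS; i++) {
        worker_epfd[i] = epoll_create1(0);
        pthread_create(&tid[i], NULL, worker_main, &worker_epfd[i]);
    }

    /* Acceptor loop: hand each new connection to the next worker's epoll. */
    for (int next = 0;; next = (next + 1) % NWORKERS) {
        int conn = accept(listen_fd, NULL, NULL);
        if (conn < 0)
            continue;
        struct epoll_event ev = { .events = EPOLLIN, .data.fd = conn };
        epoll_ctl(worker_epfd[next], EPOLL_CTL_ADD, conn, &ev);
    }
}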

The trade-off between extra threads (more than the number of physical cores) and events is one of latency versus throughput. Threads improve latency because they can execute pre-emptively, but at the expense of throughput due to the overhead incurred by context switching and thread creation/destruction. Events improve throughput but have the disadvantage that long-running code causes the entire thread to halt.

The second best is how Apache2 does it, using a pool of blocking threads. There is no event processing here, so the implementation is simpler, and the pool means threads are not created and destroyed unnecessarily, but it can't really compete with a well-implemented thread/asynchronous hybrid like what you're trying to implement or like Nginx.
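For comparison, a minimal sketch of that blocking thread-pool shape (not Apache2's actual code): a fixed pool of threads all block in accept() on the same listening socket and serve each connection with ordinary blocking I/O. Pool size and the echo handler are placeholders.

/* Sketch of the "pool of blocking threads" model; error handling abbreviated. */
#include <pthread.h>
#include <sys/socket.h>
#include <unistd.h>

#define POOL_SIZE 16

static void *pool_worker(void *arg)
{
    int listen_fd = *(int *)arg;
    char buf[4096];

    for (;;) {
        int conn = accept(listen_fd, NULL, NULL); /* blocks until a client arrives */
        if (conn < 0)
            continue;
        ssize_t r;
        while ((r = read(conn, buf, sizeof buf)) > 0) {
            /* ... handle the request with ordinary blocking calls ... */
            write(conn, buf, (size_t)r);          /* placeholder: echo */
        }
        close(conn);
    }
    return NULL;
}

/* Called with an already bound and listening socket. */
static void start_pool(int listen_fd)
{
    pthread_t tid[POOL_SIZE];
    for (int i = 0; i < POOL_SIZE; i++)
        pthread_create(&tid[i], NULL, pool_worker, &listen_fd);
}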

The third best is asynchronous event processing alone, like Lighttpd or Node.js. Well, it's the second best if you're not doing heavy processing in the server. But as mentioned earlier, a single long-running while loop blocks the entire server.

Check the libevent and libev sources. They're highly readable, and already a good infrastructure to use.

Also, libev's documentation has plenty of examples of several tried-and-true strategies. Even if you prefer to code directly against epoll(), the examples can lead to several insights.
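For instance, a minimal watcher example in the style of libev's documentation (assuming the libev 4.x names ev_run/ev_break): it watches stdin for readability and exits after the first event.

#include <ev.h>
#include <stdio.h>

static void stdin_cb(struct ev_loop *loop, ev_io *w, int revents)
{
    puts("stdin is readable");
    ev_io_stop(loop, w);          /* stop watching this fd */
    ev_break(loop, EVBREAK_ALL);  /* leave ev_run() */
}

int main(void)
{
    struct ev_loop *loop = EV_DEFAULT;   /* the default event loop */
    ev_io stdin_watcher;

    ev_io_init(&stdin_watcher, stdin_cb, 0 /* fd 0 = stdin */, EV_READ);
    ev_io_start(loop, &stdin_watcher);

    ev_run(loop, 0);   /* dispatch events until ev_break() */
    return 0;
}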

Unless you have a terabit uplink and plan to service 10,000 simultaneous connections off a single server, forget about epoll. It's just gratuitous non-portability; poll or even select will do just as well. Keep in mind that by the time terabit uplinks and the like are standard, your server will also be sufficiently faster that you still won't need epoll.
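For reference, here is a minimal sketch of the same readiness loop written with portable poll() instead of epoll (single-threaded; the fd limit and echo handler are placeholders, error handling abbreviated).

#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

#define MAXFDS 1024

static void poll_loop(int listen_fd)
{
    struct pollfd fds[MAXFDS];
    int nfds = 1;
    fds[0].fd = listen_fd;
    fds[0].events = POLLIN;
    char buf[4096];

    for (;;) {
        poll(fds, nfds, -1);

        /* New connection on the listening socket? */
        if (fds[0].revents & POLLIN) {
            int conn = accept(listen_fd, NULL, NULL);
            if (conn >= 0 && nfds < MAXFDS) {
                fds[nfds].fd = conn;
                fds[nfds].events = POLLIN;
                nfds++;
            } else if (conn >= 0) {
                close(conn);   /* table full */
            }
        }

        /* Readable clients. */
        for (int i = 1; i < nfds; i++) {
            if (!(fds[i].revents & POLLIN))
                continue;
            ssize_t r = read(fds[i].fd, buf, sizeof buf);
            if (r <= 0) {
                close(fds[i].fd);
                fds[i--] = fds[--nfds];            /* compact the array */
            } else {
                write(fds[i].fd, buf, (size_t)r);  /* placeholder: echo */
            }
        }
    }
}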

If you're just serving static content, forget about threads too and use the Linux sendfile syscall. This too is nonstandard, but at least it offers huge real-world performance benefits.
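A minimal sketch of how sendfile(2) is typically used to push a file to a socket: the kernel copies file data straight to the socket, avoiding user-space buffers. It assumes the HTTP headers have already been written to conn_fd, and error handling is abbreviated.

#include <fcntl.h>
#include <sys/sendfile.h>
#include <sys/stat.h>
#include <unistd.h>

static int send_static_file(int conn_fd, const char *path)
{
    int file_fd = open(path, O_RDONLY);
    if (file_fd < 0)
        return -1;

    struct stat st;
    fstat(file_fd, &st);

    off_t offset = 0;
    while (offset < st.st_size) {
        /* sendfile advances offset by the number of bytes sent. */
        ssize_t sent = sendfile(conn_fd, file_fd, &offset, st.st_size - offset);
        if (sent <= 0)
            break;   /* real code would retry on EINTR / handle EAGAIN */
    }

    close(file_fd);
    return (offset == st.st_size) ? 0 : -1;
}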

Also note that other design decisions (especially excess complexity) will be much more of a factor in how much load your server can handle. For an example, just look at how the modest single-threaded, single-process thttpd blows away Apache and friends in performance on static content -- and in my experience, even on traditional CGI dynamic content!
