
TCP/IP - Solving the C10K with the thread per client approach

Original post: 2013-07-11 12:42:20 | Tags: multithreading, concurrency, tcp, c10k

After reading the famous C10k article and searching the web for how things have evolved since it was written, I would like to know whether a standard server today could handle >10000 concurrent connections using a thread per connection (possibly with the help of a thread pool, to avoid the cost of repeatedly creating and destroying threads).

Some details that may affect the approach to the problem:

- Input, intermediate processing and output.
- Length of each connection.
- Technical specifications of the server (cores, processors, RAM, etc.)
- Combining this system with alternative techniques like AIO, polling, green threads, etc.

Obviously I'm not an expert in the matter, so any remarks or advice will be highly appreciated :)

3 Answers

Absolutely. A standard server can handle more than 10K concurrent connections using the model with one thread per connection. I have built such an application, and five years ago it was running with more than 50K concurrent connections per process on a standard Linux server. Nowadays, it should be possible to run the same application with more than 250K concurrent connections on current hardware.

There are only a few things to keep in mind:

- Reuse threads by using a thread pool. There is no need to kill idle threads, because resource usage should be sized for peak load anyway.
- Stack size: by default each Linux thread reserves 8 MB for its stack, which adds up to 80 GB of reserved address space for 10K threads. Set the default stack size to a value between 64k and 512k; this is rarely a problem, because most applications don't need deeper call stacks.
- If the connections are short-lived, optimize for new connections by creating several sockets on the same endpoint with the option SO_REUSEPORT.
- Increase the user limits: open files (default 1,024) and max user processes.
- Increase system limits, e.g. /proc/sys/kernel/pid_max (default 32K), /proc/sys/kernel/threads-max, and /proc/sys/vm/max_map_count (default 65K).
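To make the stack-size arithmetic and the limits above concrete, here is a minimal Python sketch. It is an illustration, not part of the application described in this answer: the function names are made up, the 256 KB stack is just one value from the suggested range, and the `/proc` paths only exist on Linux.

```python
import resource
import threading
from pathlib import Path


def stack_reservation_mb(threads: int, stack_kb: int) -> float:
    """Address space reserved for thread stacks, in MB."""
    return threads * stack_kb / 1024


def read_limits() -> dict:
    """Collect the per-user and system-wide limits named in the list above."""
    limits = {
        # Each rlimit is a (soft, hard) pair.
        "open files": resource.getrlimit(resource.RLIMIT_NOFILE),
        "max user processes": resource.getrlimit(resource.RLIMIT_NPROC),
    }
    for name in ("kernel/pid_max", "kernel/threads-max", "vm/max_map_count"):
        path = Path("/proc/sys") / name
        # These files only exist on Linux; report None elsewhere.
        limits[name] = int(path.read_text()) if path.exists() else None
    return limits


def run_with_small_stack(stack_kb: int = 256) -> int:
    """Run one thread with a reduced stack and return the value it computed."""
    # stack_size() applies to threads created after the call.
    previous = threading.stack_size(stack_kb * 1024)
    result = []
    try:
        worker = threading.Thread(target=lambda: result.append(sum(range(1000))))
        worker.start()
        worker.join()
    finally:
        threading.stack_size(previous)  # restore the earlier setting
    return result[0]


if __name__ == "__main__":
    print(stack_reservation_mb(10_000, 8 * 1024))  # 80000.0 MB, i.e. about 80 GB
    print(stack_reservation_mb(10_000, 256))       # 2500.0 MB, i.e. about 2.5 GB
    for name, value in read_limits().items():
        print(name, value)
```

With 256 KB stacks, 10K threads reserve roughly 2.5 GB instead of 80 GB. The printed limits show which values need raising (via `ulimit` or `sysctl`) before a process can hold that many sockets and threads.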

The application mentioned above was initially designed to handle only 2K concurrent connections. However, as usage grew, we didn't have to make significant changes to the code in order to scale up to 50K connections.
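The overall shape of such a server might look like the following Python sketch. It is not the application described above. It is a small echo server that assumes Linux or BSD (for SO_REUSEPORT) and Python 3.8+. The names `handle_client`, `accept_loop`, `open_listeners`, `MAX_CLIENTS` and `STACK_BYTES` are hypothetical.

```python
import socket
import threading
from concurrent.futures import ThreadPoolExecutor

MAX_CLIENTS = 10_000       # hypothetical cap: one pooled thread per live connection
STACK_BYTES = 256 * 1024   # reduced stack, inside the 64k-512k range suggested above


def handle_client(conn: socket.socket) -> None:
    """Serve one connection on one thread for its whole lifetime (echo)."""
    with conn:
        conn.settimeout(None)            # plain blocking I/O
        while data := conn.recv(4096):   # b"" means the peer closed
            conn.sendall(data)


def accept_loop(listener, pool, stop) -> None:
    """Accept connections and hand each one to a pooled thread."""
    listener.settimeout(0.2)             # wake up regularly to check `stop`
    while not stop.is_set():
        try:
            conn, _addr = listener.accept()
        except socket.timeout:
            continue
        pool.submit(handle_client, conn)


def open_listeners(host: str, port: int, count: int) -> list:
    """Open `count` listening sockets on one endpoint using SO_REUSEPORT."""
    first = socket.create_server((host, port), reuse_port=True)
    bound_port = first.getsockname()[1]  # resolves port 0 to the real port
    rest = [socket.create_server((host, bound_port), reuse_port=True)
            for _ in range(count - 1)]
    return [first, *rest]


def recv_exact(sock: socket.socket, size: int) -> bytes:
    """Read until `size` bytes have arrived or the peer closes."""
    buf = b""
    while len(buf) < size:
        chunk = sock.recv(size - len(buf))
        if not chunk:
            break
        buf += chunk
    return buf


def demo() -> bytes:
    """Start the server, echo one message through it, then shut down."""
    previous = threading.stack_size(STACK_BYTES)  # affects threads created later
    stop = threading.Event()
    listeners = open_listeners("127.0.0.1", 0, 2)
    port = listeners[0].getsockname()[1]
    try:
        with ThreadPoolExecutor(max_workers=MAX_CLIENTS) as pool:
            acceptors = [threading.Thread(target=accept_loop, args=(lst, pool, stop))
                         for lst in listeners]
            for thread in acceptors:
                thread.start()
            try:
                with socket.create_connection(("127.0.0.1", port)) as client:
                    client.sendall(b"ping")
                    reply = recv_exact(client, 4)
            finally:
                stop.set()
                for thread in acceptors:
                    thread.join()
    finally:
        threading.stack_size(previous)
        for lst in listeners:
            lst.close()
    return reply


if __name__ == "__main__":
    print(demo())
```

The pool creates threads lazily and reuses idle ones, so `MAX_CLIENTS` acts as the peak-load ceiling rather than an up-front cost. Each listener gets its own accept thread, which is how SO_REUSEPORT spreads new connections.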

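You might want to read a recent follow-up on this subject: The Secret to 10 Million Concurrent Connections - The Kernel is the Problem, Not the Solution.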

The usual approaches for servers are either: (a) thread per connection (often with a thread pool), or (b) single threaded with asynchronous IO (often with epoll or kqueue). My thinking is that some elements of these approaches can, and often should, be combined to use asynchronous IO (with epoll or kqueue) and then hand off the connection request to a thread pool to process. This approach would combine the efficient dispatch of asynchronous IO with the parallelism provided by the thread pool.
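A rough Python sketch of this combined design is below. It is only an illustration, not the C++ server mentioned in this answer. It uses the standard `selectors` module, which picks epoll on Linux and kqueue on BSD/macOS. The names `process`, `event_loop` and `demo` are hypothetical, and upper-casing the request stands in for real work.

```python
import selectors
import socket
import threading
from concurrent.futures import ThreadPoolExecutor


def process(conn: socket.socket, request: bytes) -> None:
    """Worker-side request handling; upper-casing stands in for real work."""
    try:
        conn.sendall(request.upper())
    except OSError:
        pass  # the peer went away while the request was being processed


def event_loop(listener, pool, stop) -> None:
    """Single dispatcher thread: wait for readiness, hand requests to the pool."""
    sel = selectors.DefaultSelector()  # epoll on Linux, kqueue on BSD/macOS
    listener.setblocking(False)
    sel.register(listener, selectors.EVENT_READ)
    try:
        while not stop.is_set():
            for key, _events in sel.select(timeout=0.2):
                sock = key.fileobj
                if sock is listener:
                    try:
                        conn, _addr = listener.accept()
                    except BlockingIOError:
                        continue  # the pending connection vanished
                    conn.settimeout(5.0)  # bounds how long a worker's sendall may wait
                    sel.register(conn, selectors.EVENT_READ)
                    continue
                try:
                    data = sock.recv(4096)  # readiness was signalled, so no long wait
                except OSError:
                    data = b""
                if data:
                    pool.submit(process, sock, data)
                else:  # b"" means the peer closed the connection
                    sel.unregister(sock)
                    sock.close()
    finally:
        for key in list(sel.get_map().values()):
            if key.fileobj is not listener:
                key.fileobj.close()
        sel.close()


def demo() -> bytes:
    """Run the server briefly and send one request through it."""
    stop = threading.Event()
    listener = socket.create_server(("127.0.0.1", 0))
    port = listener.getsockname()[1]
    try:
        with ThreadPoolExecutor(max_workers=4) as pool:
            loop = threading.Thread(target=event_loop, args=(listener, pool, stop))
            loop.start()
            try:
                with socket.create_connection(("127.0.0.1", port)) as client:
                    client.sendall(b"hello")
                    reply = client.recv(4096)
            finally:
                stop.set()
                loop.join()
    finally:
        listener.close()
    return reply


if __name__ == "__main__":
    print(demo())
```

Idle connections cost only a registered file descriptor here, and pool threads are busy only while a request is being processed. The sketch does not order responses when one connection sends several requests at once. A real server would need per-connection queuing for that.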

I have written such a server for fun (in C++) that uses epoll on Linux and kqueue on FreeBSD and OSX along with a thread pool. I just need to put it through its paces with some heavy testing, do some code cleanup, and then toss it out on github (hopefully soon).