
nginx: It's multithreaded but uses multiple processes?

Original post: 2011-01-21 22:59:42 | 6 3 | Tags: apache, architecture, nginx

I'm trying to understand what makes Nginx so fast, and I have a few questions.

As I understand it, Apache either spawns a new process to serve each request OR spawns a new thread to serve each request. Since each new thread shares the virtual address space, memory usage keeps climbing when there are many concurrent requests coming in.

Nginx solves this by having just one listening process (the master) with a single thread of execution, AND 2 or 3 worker processes (the number is configurable). This master process/thread runs an event loop, effectively waiting for any incoming request. When a request comes in, it gives that request to one of the worker processes.

Please correct me if my understanding above is not correct.

If the above is correct, then I have a few questions:

1.) Isn't the worker process going to spawn multiple threads and run into the same problem as Apache?

2.) Or is nginx fast because its event-based architecture uses non-blocking I/O underneath it all? Maybe the worker process spawns threads which do only non-blocking I/O, is that it?

3.) What "exactly" is an "event-based architecture"? Can someone really simplify it, for someone like me to understand? Does it just pertain to non-blocking I/O, or to something else as well?

I got a reference to C10k and I am trying to go through it, but I don't think it's about event-based architecture; it seems to be more about non-blocking I/O.

3 Answers

Apache uses multiple threads to provide each request with its own thread of execution. This is necessary to avoid blocking when using synchronous I/O.

Nginx uses only asynchronous I/O, which makes blocking a non-issue. The only reason nginx uses multiple processes is to make full use of multi-core, multi-CPU and hyper-threading systems. Even with SMP support, the kernel cannot schedule a single thread of execution over multiple CPUs at the same time. It requires at least one process or thread per logical CPU.

So the difference is that nginx requires only enough worker processes to get the full benefit of SMP, whereas Apache's architecture necessitates creating a new thread (each with its own stack of around 8MB) per request. Obviously, at high concurrency, Apache will use much more memory and suffer greater overhead from maintaining large numbers of threads.
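To put rough numbers on the stack cost mentioned above, here is a back-of-the-envelope sketch. It assumes the ~8MB per-thread stack figure from this answer and one thread per in-flight request; the function name is made up for illustration. Note that this is address space reserved for stacks, which is usually far more than the memory actually touched.

```python
def thread_stack_reservation_gib(concurrent_requests: int, stack_mib: int = 8) -> float:
    """Stack address space reserved if every in-flight request has its own thread.

    stack_mib=8 is the approximate figure quoted in the answer, not a measured value.
    """
    return concurrent_requests * stack_mib / 1024


for n in (100, 1_000, 10_000):
    gib = thread_stack_reservation_gib(n)
    print(f"{n:>6} threads -> {gib:.1f} GiB of stack address space")
```

With a fixed handful of worker processes the same calculation stays constant no matter how many connections are open, which is the point the answer is making.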

It's not very complicated from a conceptual point of view. I'll try to be clear but I have to do some simplification.

The event-based servers (like nginx and lighttpd) use a wrapper around an event monitoring system. For example, lighttpd uses libevent to abstract the more advanced high-speed event monitoring systems (see also libev).

The server keeps track of all the non-blocking connections it has (both writing and reading) using a simple state machine for each connection. The event monitoring system notifies the server process when there is new data available or when it can write more data. It's like a select() on steroids, if you know socket programming. The server process then simply sends the requested file using some advanced function like sendfile() where possible, or hands the request to a CGI process using a socket for communication (this socket is monitored by the event monitoring system like the other network connections).
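The idea above can be sketched in a few lines with Python's standard `selectors` module, which picks epoll, kqueue or select() depending on the platform. This is only an illustration of the pattern, not how nginx or lighttpd are written: the `Connection` class, `register` and `poll_once` are names invented here, and the "request handling" is just upper-casing the bytes received.

```python
import selectors
import socket


class Connection:
    """Per-connection state: the bytes still waiting to be written back."""

    def __init__(self, sock: socket.socket) -> None:
        self.sock = sock
        self.outbox = b""


def register(sel: selectors.BaseSelector, sock: socket.socket) -> None:
    """Put a socket in non-blocking mode and start watching it for reads."""
    sock.setblocking(False)
    sel.register(sock, selectors.EVENT_READ, Connection(sock))


def poll_once(sel: selectors.BaseSelector, timeout: float = 0.1) -> None:
    """One event-loop iteration: service every socket the OS reports as ready."""
    for key, events in sel.select(timeout):
        conn = key.data
        if events & selectors.EVENT_READ:
            data = conn.sock.recv(4096)
            if not data:  # peer closed the connection
                sel.unregister(conn.sock)
                conn.sock.close()
                continue
            conn.outbox += data.upper()  # stand-in for real request handling
        if events & selectors.EVENT_WRITE and conn.outbox:
            sent = conn.sock.send(conn.outbox)
            conn.outbox = conn.outbox[sent:]
        # State transition: ask about writability only while data is pending.
        wanted = selectors.EVENT_READ
        if conn.outbox:
            wanted |= selectors.EVENT_WRITE
        sel.modify(conn.sock, wanted, conn)


# Demo: one thread, several connections, no blocking calls per connection.
sel = selectors.DefaultSelector()
pairs = [socket.socketpair() for _ in range(3)]
for server_side, _client in pairs:
    register(sel, server_side)
for i, (_server_side, client) in enumerate(pairs):
    client.sendall(f"ping {i}".encode())
for _ in range(3):
    poll_once(sel, 0.05)
for _server_side, client in pairs:
    client.settimeout(1)
    print(client.recv(4096))
```

A single thread serves all three connections here; each loop iteration only touches the sockets that are actually ready, so idle connections cost a small state object rather than a thread.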

This link has a lot of great information about the internals of nginx, just in case. I hope it helps.

Apache doesn't spawn a new thread for each request. It maintains a cache of threads or a group of pre-forked processes which it farms requests out to. The number of concurrent requests is limited by the number of children/threads, yes, but Apache is not spawning a new thread/child for every request, which would be ridiculously slow (even with threads, creation and teardown for every request would be way too slow).
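The "cache of threads" idea is the ordinary thread-pool pattern. As a rough illustration (not Apache's actual code), Python's `ThreadPoolExecutor` shows the same behaviour: 100 requests are farmed out to at most 4 long-lived threads. The `handle` function is a hypothetical stand-in for a request handler.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def handle(request_id: int) -> str:
    """Hypothetical handler: report which pooled thread served the request."""
    return threading.current_thread().name


# The pool caps concurrency at 4 and reuses its threads across requests.
with ThreadPoolExecutor(max_workers=4, thread_name_prefix="worker") as pool:
    served_by = list(pool.map(handle, range(100)))

print(len(served_by), "requests served by", len(set(served_by)), "distinct threads")
```

The number of distinct threads never exceeds `max_workers`, which is also why the pool size bounds how many requests can be in progress at once.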

Nginx uses a master-worker model. The master process deals with loading the configuration and creating/destroying/maintaining workers. Like Apache, it starts out with a number of pre-forked processes already running: one of them is the "master" process and the rest are workers. The worker processes all share a set of listening sockets. Each worker process accepts connections and processes them, but each worker can handle THOUSANDS of connections at once, unlike Apache, which can only handle 1 connection per worker.
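Here is a minimal sketch of the "pre-forked workers sharing a listening socket" part, assuming a Unix-like system where `os.fork()` is available. The function names are invented for the example, and to keep it short each worker here handles one connection at a time with a blocking accept(); a real nginx worker would instead run an event loop over thousands of connections.

```python
import os
import socket


def serve_forever(listener: socket.socket) -> None:
    """Worker loop: every worker calls accept() on the same inherited socket."""
    while True:
        conn, _addr = listener.accept()
        with conn:
            conn.sendall(f"handled by pid {os.getpid()}\n".encode())


def prefork(listener: socket.socket, workers: int) -> list[int]:
    """Fork `workers` children that share `listener`; return their pids."""
    pids = []
    for _ in range(workers):
        pid = os.fork()
        if pid == 0:  # child: becomes a worker and never returns to the caller
            try:
                serve_forever(listener)
            finally:
                os._exit(0)
        pids.append(pid)  # parent ("master"): just remembers its workers
    return pids
```

The parent would create the listening socket once, call `prefork(listener, n)`, and then only supervise the returned pids; the kernel decides which worker's accept() gets each incoming connection.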

The way nginx achieves this is through "multiplexing". It doesn't use libevent; it uses a custom event loop which was designed specifically for nginx and grew alongside the nginx software. Multiplexing works by using a loop to "increment" through the program chunk by chunk, operating on one piece of data, one new connection, or whatever else is pending, per connection/object per loop iteration. It is all based on backends like epoll, kqueue and select(), which you should read up on.