What is best, a single-threaded or a multi-threaded server?

I have to build a simple client <-> server program in C (Linux) to transfer files.

The server accepts connections on port 10000. I don't know whether it is better to create a new thread for each request, or to create a fixed number of threads and use an asynchronous technique.

CASE A:

client --> server --> (new thread) --> process the request

CASE B:

SERVER --> create thread 1 - thread 2 - thread 3

then

client1 --> server --> thread 1
client2 --> server --> thread 2
client3 --> server --> thread 3
client4 --> server --> thread 1
client5 --> server --> thread 2
client6 --> server --> thread 3

In this case, thread 1 could process requests from many clients.

My considerations:

CASE 1 (A): is faster but wastes a lot of memory

CASE 2 (B): is slower but uses little memory

Am I wrong?

If you look at the architecture of widely used HTTP servers (nginx, lighttpd, Apache), you'll notice that the ones using a fixed thread count (so-called "worker threads", whose number should depend on the processor count of the server) are a lot faster than the ones using a large thread pool. However, there are some very important points:

  1. A "worker thread" implementation cannot be as straightforward as is tempting, or there will be no real performance gain. Each thread should implement its own pseudo-concurrency using a state machine, processing multiple requests at a time. No blocking operations are allowed here: for example, the time a thread would otherwise waste waiting for disk I/O to deliver file contents can be used to parse the request of the next client. That is pretty difficult code to write, though.

  2. A thread-based solution (with a reusable thread pool, as thread creation IS a heavyweight operation) is optimal when weighing performance against coding time and code maintenance. If your server is not supposed to handle thousands of requests per second, you can code in a pretty natural blocking style without risking a complete performance failure.

  3. As you may notice, "worker thread" solutions themselves serve only static data; they proxy dynamic script execution to other programs. As far as I know (I may be wrong), that is due to the complexity of non-blocking request processing when some unknown dynamic code executes in its context. That should not be an issue in your case anyway, as you are talking about simple file transfer.
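To make the state-machine idea from point 1 a bit more concrete, here is a minimal sketch. It only models the bookkeeping, with no real sockets: the names `struct conn`, `conn_step`, `run_all` and the "budget" parameter (bytes that could be moved right now without blocking) are all made up for illustration, not taken from any real server.

```c
#include <stddef.h>

/* Hypothetical per-connection states for a simple file-transfer server. */
enum conn_state { READING_REQUEST, SENDING_FILE, DONE };

struct conn {
    enum conn_state state;
    size_t request_left;  /* request bytes still to be read */
    size_t file_left;     /* file bytes still to be sent */
};

/* Advance one connection by a single non-blocking step.
 * `budget` stands in for "how many bytes the socket can move right now";
 * a real server would learn that from read()/write() return values. */
enum conn_state conn_step(struct conn *c, size_t budget)
{
    switch (c->state) {
    case READING_REQUEST:
        c->request_left -= budget < c->request_left ? budget : c->request_left;
        if (c->request_left == 0)
            c->state = c->file_left > 0 ? SENDING_FILE : DONE;
        break;
    case SENDING_FILE:
        c->file_left -= budget < c->file_left ? budget : c->file_left;
        if (c->file_left == 0)
            c->state = DONE;
        break;
    case DONE:
        break;
    }
    return c->state;
}

/* One worker thread interleaving many connections: each pass gives every
 * unfinished connection one step, so no client waits for another to finish.
 * Returns the number of passes needed to finish them all. */
size_t run_all(struct conn *conns, size_t n, size_t budget)
{
    size_t passes = 0;
    if (budget == 0)
        return 0;  /* nothing could ever make progress */
    for (;;) {
        size_t active = 0;
        for (size_t i = 0; i < n; i++) {
            if (conns[i].state != DONE) {
                conn_step(&conns[i], budget);
                active++;
            }
        }
        if (active == 0)
            return passes;
        passes++;
    }
}
```

The point of the sketch is that a worker never sits inside one client's request: it does a small step, remembers where that client is, and moves on to the next one.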

The reason why a limited-thread solution is faster on heavily loaded systems: a thread context switch (http://en.wikipedia.org/wiki/Context_switch) is a pretty costly operation, as it requires saving the register contents and loading new ones, as well as some other thread-local data. If you have too many threads compared to the processor count (like 1000x more), a lot of time in your application will be wasted simply switching between threads.

So, the short answer to your question is: "No, it has nothing to do with memory usage; the choice is all about the type of data served, the planned requests-per-second count, and the ability to spend a lot of time on coding".

There's no right answer here. It depends on a lot of things, and you need to choose for yourself.

"CASE 1: Is faster but waste a lot of memory"
"CASE 2: Is slower but use a low memory"

Wrong. It depends on a lot of things. Creating threads is not that expensive (it costs something, but not that much), but if you end up with too many threads, you'll have a problem.

This depends very much on the load: what is the expected load? If it is, let's say, about 1000 requests per second, then creating 1000 threads each second will be a disaster :D

Also, create only as many threads as the CPU can handle without (much) switching between them. There's a good chance (depending on your program, of course) that a single-core CPU will run much, much slower with 10 threads than with 1 (or 2). This really depends on what these threads do, too.

I'd choose to create a thread pool and reuse the threads.

My first choice would be to do it single-threaded using select(2). If that wasn't good enough performance-wise, I'd go with a thread-pool solution. It will scale better.

There are times when creating one thread per client is perfectly OK. I've done that and it worked well for that application, with usually around 100 clients and up to a maximum of 1000 clients. That was 15 years ago. Today the same application could probably handle 10000 clients thanks to better hardware. Just be aware that one thread per client doesn't scale very well.

I know it has been quite a while since you asked this, but here's my take on your question from the perspective of someone who has already written a handful of servers in C.

If your server is your own, totally independent, and does not depend on other people's code, I would highly recommend that you do it single-threaded with non-blocking sockets, using epoll (Linux), kqueue (BSD) or WSAEventSelect (Windows).

This may require that you split code that would otherwise have been "simple" into much smaller chunks, but if scalability is what you're after, this will beat any thread-based or select-based server.
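On Linux, the basic building blocks of that approach can be sketched as follows. The helper names (`set_nonblocking`, `watch_for_input`, `next_ready_fd`) and the -1/-2 return convention are my own. Only fcntl(), epoll_ctl() and epoll_wait() are real calls. A real server would fetch many events per epoll_wait() call instead of one.

```c
#include <fcntl.h>
#include <sys/epoll.h>

/* Put a descriptor into non-blocking mode so read()/write() never stall. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Register fd with an epoll instance (from epoll_create1) for readability. */
int watch_for_input(int epfd, int fd)
{
    if (set_nonblocking(fd) == -1)
        return -1;
    struct epoll_event ev = {0};
    ev.events = EPOLLIN;
    ev.data.fd = fd;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Wait up to timeout_ms for one ready descriptor.
 * Returns the fd, -1 on timeout, or -2 on error. */
int next_ready_fd(int epfd, int timeout_ms)
{
    struct epoll_event ev;
    int n = epoll_wait(epfd, &ev, 1, timeout_ms);
    if (n == 1)
        return ev.data.fd;
    return n == 0 ? -1 : -2;
}
```

Unlike select(), the set of watched descriptors lives in the kernel, so the cost per wait does not grow with the number of idle connections.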

There is a great article called "The C10K Problem" that is focused entirely on the problem of how to handle 10,000 concurrent connections. I actually learned a lot from it myself. Here's the link: http://www.kegel.com/c10k.html

There is also another great article focused on scalability, called "Scalable Networking", that you can find here: http://bulk.fefe.de/scalable-networking.pdf

Those two are great reads, hope that helps.

This is entirely down to you. There is no right or wrong answer. You've already identified the pros and cons of both, and you're right about both: 1 is faster but more resource-intensive, 2 is slower because clients may have to wait.

I would go with a pool of pre-created threads and reuse them when they are done with the request they're currently handling. Creating threads can be expensive, as it mostly involves calls into the kernel.

There is a "threadpool" type project here using pthreads. Perhaps you can get some ideas from it on how to implement one.
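As a rough idea of what such a pool involves, here is a minimal pthreads sketch. It is not taken from the project mentioned above. Everything in it (`struct pool`, `pool_start`, `pool_submit`, `pool_stop`, the sizes, and the placeholder `handle_client`) is made up for illustration. The accept() loop would call `pool_submit` with each new client socket.

```c
#include <pthread.h>
#include <stdbool.h>
#include <unistd.h>

#define POOL_THREADS 3   /* fixed worker count, as in CASE B */
#define QUEUE_CAP    64  /* hypothetical limit on waiting clients */

struct pool {
    pthread_t       threads[POOL_THREADS];
    int             nthreads;
    pthread_mutex_t lock;
    pthread_cond_t  not_empty, not_full;
    int             queue[QUEUE_CAP];  /* ring buffer of client fds */
    int             head, count;
    bool            stopping;
    void          (*handler)(int client_fd);
};

/* Placeholder: a real server would send the requested file here. */
void handle_client(int client_fd)
{
    if (write(client_fd, "ok\n", 3) < 0) { /* ignored in this sketch */ }
    close(client_fd);
}

static void *pool_worker(void *arg)
{
    struct pool *p = arg;
    for (;;) {
        pthread_mutex_lock(&p->lock);
        while (p->count == 0 && !p->stopping)
            pthread_cond_wait(&p->not_empty, &p->lock);
        if (p->count == 0) {            /* stopping and queue drained */
            pthread_mutex_unlock(&p->lock);
            return NULL;
        }
        int fd = p->queue[p->head];
        p->head = (p->head + 1) % QUEUE_CAP;
        p->count--;
        pthread_cond_signal(&p->not_full);
        pthread_mutex_unlock(&p->lock);
        p->handler(fd);                 /* blocking work happens unlocked */
    }
}

int pool_start(struct pool *p, void (*handler)(int))
{
    p->head = p->count = p->nthreads = 0;
    p->stopping = false;
    p->handler = handler;
    pthread_mutex_init(&p->lock, NULL);
    pthread_cond_init(&p->not_empty, NULL);
    pthread_cond_init(&p->not_full, NULL);
    for (int i = 0; i < POOL_THREADS; i++) {
        if (pthread_create(&p->threads[i], NULL, pool_worker, p) != 0)
            return -1;                  /* caller should call pool_stop */
        p->nthreads++;
    }
    return 0;
}

/* Queue a client; blocks while the queue is full. */
void pool_submit(struct pool *p, int client_fd)
{
    pthread_mutex_lock(&p->lock);
    while (p->count == QUEUE_CAP)
        pthread_cond_wait(&p->not_full, &p->lock);
    p->queue[(p->head + p->count) % QUEUE_CAP] = client_fd;
    p->count++;
    pthread_cond_signal(&p->not_empty);
    pthread_mutex_unlock(&p->lock);
}

/* Let workers finish queued clients, then join them. */
void pool_stop(struct pool *p)
{
    pthread_mutex_lock(&p->lock);
    p->stopping = true;
    pthread_cond_broadcast(&p->not_empty);
    pthread_mutex_unlock(&p->lock);
    for (int i = 0; i < p->nthreads; i++)
        pthread_join(p->threads[i], NULL);
    pthread_mutex_destroy(&p->lock);
    pthread_cond_destroy(&p->not_empty);
    pthread_cond_destroy(&p->not_full);
}
```

The handlers can be written in plain blocking style, and the thread creation cost is paid only once at startup.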

It really depends on what your server is doing.

I would recommend that you do the simplest thing possible. This is probably a single-process model which multiplexes all available connections using select, poll, libevent or similar.

That is, if you're using TCP.

If you use UDP, it's even easier, as the application can do everything with one socket, so it can (possibly) use a blocking socket.
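To illustrate the one-socket UDP case, here is a small sketch of a blocking receive-and-reply step. `udp_listen` and `udp_echo_once` are made-up names, binding to the loopback address is just an assumption to keep the example local, and echoing the datagram stands in for whatever the real protocol would do.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Create a UDP socket bound to 127.0.0.1:port (port 0 = kernel picks one).
 * Returns the socket, or -1 on error. */
int udp_listen(uint16_t port)
{
    int sock = socket(AF_INET, SOCK_DGRAM, 0);
    if (sock < 0)
        return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    if (bind(sock, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(sock);
        return -1;
    }
    return sock;
}

/* Block until one datagram arrives, then send it back to its sender.
 * The same socket serves every client; recvfrom() tells us who sent it.
 * Returns the number of bytes echoed, or -1 on error. */
ssize_t udp_echo_once(int sock)
{
    char buf[1500];
    struct sockaddr_storage peer;
    socklen_t peer_len = sizeof peer;

    ssize_t n = recvfrom(sock, buf, sizeof buf, 0,
                         (struct sockaddr *)&peer, &peer_len);
    if (n < 0)
        return -1;
    return sendto(sock, buf, (size_t)n, 0,
                  (struct sockaddr *)&peer, peer_len);
}
```

A server would simply call `udp_echo_once` in a loop. There is no accept() and no per-client descriptor to multiplex.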
