
Is it better to use select or multi-threading (or both) when creating a server that will simultaneously send/receive tasks for 200+ clients

I am creating a server that will be sending and receiving tasks from over 200 clients simultaneously (potentially more clients in the future). There will also be background engines on the clients that will perform tasks and send responses to the server without asking first. I expect there to be a high volume of information transferred both ways. I've been doing research into multi-threading and using the select function, and I'm wondering, given the parameters of the project, which option (or a combination of the two) would be the most efficient and scalable solution for the amount of traffic that might occur.

Any suggestions would be greatly appreciated. I'd be glad to answer any questions to provide more clarity.

Either approach will work; as far as which is "better", that's going to depend a lot on how you define the word "better".

  • The single-threaded approach avoids any chance of problems with race conditions or deadlocks, because those problems inherently can't occur in a single-threaded program. In a multithreaded program you have to be extremely careful about data-locking patterns, or else you will find yourself trying to debug very mysterious malfunctions that only occur once every few days/weeks/months.

  • On the other hand, the single-threaded approach limits you to using a single core; it won't be able to take advantage of a modern multi-core CPU to give you a parallelism speedup.

  • On the third hand, the multi-threaded approach can get hairy (and lose its speedup potential) if the various threads/connections often need to access any shared/mutable data structures. In that "shared data bottleneck" scenario, the threads may spend a lot of their time blocked waiting to lock a mutex, and then you're mostly back to using a single core anyway. If each connection operates independently of the others (eg as part of a simple web server) and doesn't need to interact with the other threads, then this shouldn't be a concern.

  • Multithreading allows you to use blocking I/O (which is simpler to implement than non-blocking I/O), but blocking I/O limits your control over the threads (eg how do you get a thread to exit cleanly, or take some other non-client-initiated action, if it is blocked indefinitely inside a recv() call? There aren't any good solutions to that problem, only poor ones)

  • Single-threading requires you to use non-blocking I/O (otherwise a single unresponsive client can halt service to all the other clients while the server is blocked inside a send() or recv() call), and non-blocking I/O is tricky to do correctly, since you have to handle partial-reads and partial-writes gracefully (see the sketch after this list).

  • If your program ever needs to do a non-trivial amount of computation or file I/O, note that a single-threaded design will force all clients to wait while the computation (or I/O) for any client completes. In a multithreaded design, OTOH, clients B through Z can continue to be serviced on other cores/threads while client A's thread is busy reading from the disk or crunching numbers.

  • The overhead of spawning and maintaining threads will vary from one OS to another. If you're going to be running hundreds of threads simultaneously, you might want to verify first that your target OS (and hardware) will be able to handle that load efficiently. (You can reduce the overhead of spawning and reaping threads via a thread-pool, at some expense of increased RAM usage)
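
To make the partial-read/partial-write point concrete, here is a minimal sketch of the single-threaded select() loop, assuming POSIX sockets and a listen_fd that has already been bound and put into listening mode; the "protocol" is just an echo and error handling is pared down, so treat it as an illustration rather than a finished server. Also keep in mind that select() is capped at FD_SETSIZE descriptors (typically 1024); 200+ clients fits, but poll()/epoll/kqueue remove that limit.

    // Minimal sketch: single-threaded non-blocking select() loop (POSIX sockets).
    // Assumes listen_fd is already bound and listening; error handling is abbreviated.
    #include <fcntl.h>
    #include <sys/select.h>
    #include <sys/socket.h>
    #include <unistd.h>
    #include <map>
    #include <string>
    #include <vector>

    void run_event_loop(int listen_fd) {
        std::map<int, std::string> outbox;  // per-client bytes not yet fully sent

        for (;;) {
            fd_set readfds, writefds;
            FD_ZERO(&readfds);
            FD_ZERO(&writefds);
            FD_SET(listen_fd, &readfds);
            int maxfd = listen_fd;

            for (const auto& [fd, pending] : outbox) {
                FD_SET(fd, &readfds);                         // always watch for incoming data
                if (!pending.empty()) FD_SET(fd, &writefds);  // watch writability only if we owe bytes
                if (fd > maxfd) maxfd = fd;
            }

            if (select(maxfd + 1, &readfds, &writefds, nullptr, nullptr) < 0) continue;

            if (FD_ISSET(listen_fd, &readfds)) {
                int client = accept(listen_fd, nullptr, nullptr);
                if (client >= 0) {
                    fcntl(client, F_SETFL, fcntl(client, F_GETFL, 0) | O_NONBLOCK);
                    outbox.emplace(client, "");               // register the new connection
                }
            }

            std::vector<int> closed;
            for (auto& [fd, pending] : outbox) {
                if (FD_ISSET(fd, &readfds)) {
                    char buf[4096];
                    ssize_t n = recv(fd, buf, sizeof(buf), 0);
                    if (n <= 0) { closed.push_back(fd); continue; }
                    pending.append(buf, n);   // partial read: a real protocol would buffer
                                              // until a complete message has arrived
                }
                if (FD_ISSET(fd, &writefds) && !pending.empty()) {
                    ssize_t n = send(fd, pending.data(), pending.size(), 0);
                    if (n > 0) pending.erase(0, n);  // partial write: keep the unsent tail queued
                }
            }
            for (int fd : closed) { close(fd); outbox.erase(fd); }
        }
    }

The important detail is the per-client outbox: whatever send() doesn't accept stays queued until the socket reports writable again, and whatever recv() returns is treated as a fragment rather than a complete message.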

I personally prefer the single-threaded/non-blocking-I/O approach, because blocking I/O is problematic if you want your program to be able to shut down cleanly and reliably (which you should want, if only so you can do eg memory-leak testing under valgrind). If single-core performance turns out to be insufficient, it's often fairly straightforward to extend the handle-N-sockets-on-1-thread design to a more powerful handle-N-sockets-on-each-of-M-threads design, and then you can play around with different values of N and M until you find the one that gives you the best performance (eg by setting M to the number of cores on the host machine, and handing out newly-accepted sockets to whichever thread is currently handling the smallest number of sockets)
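
As a rough illustration of that extension (a sketch, not a definitive design): the acceptor thread below hands each newly-accepted socket to whichever of the M workers currently owns the fewest sockets. Each worker is assumed to run its own select() loop like the one sketched earlier over the sockets it owns; that loop, and the mechanism for waking a worker that is blocked in select() (eg a self-pipe included in its fd set), are omitted here.

    // Sketch: acceptor thread distributing new sockets to the least-loaded of M workers.
    #include <sys/socket.h>
    #include <atomic>
    #include <mutex>
    #include <queue>
    #include <vector>

    struct Worker {
        std::atomic<int> socket_count{0};  // how many sockets this worker currently owns
        std::mutex mtx;                    // guards the hand-off queue below
        std::queue<int> incoming;          // fds handed over by the acceptor, drained by the worker
    };

    void accept_loop(int listen_fd, std::vector<Worker>& workers) {
        for (;;) {
            int client = accept(listen_fd, nullptr, nullptr);
            if (client < 0) continue;

            // Pick the worker currently handling the smallest number of sockets.
            Worker* target = &workers[0];
            for (auto& w : workers)
                if (w.socket_count.load() < target->socket_count.load()) target = &w;

            {
                std::lock_guard<std::mutex> lock(target->mtx);
                target->incoming.push(client);  // the worker adopts this fd on its next pass
            }
            target->socket_count.fetch_add(1);  // the worker decrements this when the socket closes
        }
    }

M would typically be set to the number of cores, and the vector of workers sized once up front and never resized (Worker can't be copied or moved because of the mutex and atomic it contains).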

I once wrote a chat application in Java where each connection established with the server was handled by a new Thread in the server, dedicated to managing that client.

Inside the Server class there was a static variable used to track which clients were connected.
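
A rough C++ sketch of that pattern (the original was in Java): one thread per accepted connection, plus a shared registry of connected clients; that registry is exactly the kind of shared mutable state the first answer warns about, so it is guarded by a mutex.

    // Sketch: thread-per-connection server with a shared, mutex-protected client registry.
    #include <sys/socket.h>
    #include <unistd.h>
    #include <mutex>
    #include <set>
    #include <thread>

    std::set<int> connected_clients;  // shared by every client thread
    std::mutex clients_mtx;           // guards connected_clients

    void handle_client(int fd) {
        char buf[4096];
        for (;;) {
            ssize_t n = recv(fd, buf, sizeof(buf), 0);  // blocking I/O: simple, but hard to interrupt
            if (n <= 0) break;                          // disconnect or error
            // ... parse the message, possibly relay it to other connected clients ...
        }
        {
            std::lock_guard<std::mutex> lock(clients_mtx);
            connected_clients.erase(fd);
        }
        close(fd);
    }

    void accept_forever(int listen_fd) {
        for (;;) {
            int client = accept(listen_fd, nullptr, nullptr);
            if (client < 0) continue;
            {
                std::lock_guard<std::mutex> lock(clients_mtx);
                connected_clients.insert(client);
            }
            std::thread(handle_client, client).detach();  // one thread per connection
        }
    }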

I don't know if recommending different technologies is the right way to answer your question, but I think that in your case it would be a good idea to take a look at the Erlang/Elixir platform; its whole premise is being able to handle a very large number of clients at the same time.

Currently, big companies use them: WhatsApp uses Erlang and Discord uses Elixir.

I hope that my answer was helpful.
