
Python: When to use Threads vs. Multiprocessing

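What are some good guidelines to follow when deciding whether to use threads or multiprocessing, in terms of efficiency and code clarity?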

Many of the differences between threading and multiprocessing are not really Python-specific, and some differences are specific to a certain Python implementation.

For CPython, I would use the multiprocessing module in any of the following cases:

  • I need to make use of multiple cores simultaneously for performance reasons. The global interpreter lock (GIL) would prevent any speedup when using threads. (Sometimes you can get away with threads in this case anyway, for example when the main work is done in C code called via ctypes, or when using Cython and explicitly releasing the GIL where appropriate. Of course the latter requires extra care.) Note that this case is actually rather rare. Most applications are not limited by processor time, and if they really are, you usually don't use Python.

  • I want to turn my application into a real distributed application later. This is a lot easier to do for a multiprocessing application.

  • There is very little shared state needed between the tasks to be performed.
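To make the first case concrete, here is a minimal sketch of CPU-bound work spread over several cores with multiprocessing.Pool. The function names (count_primes, parallel_prime_counts) and the workload are made up for illustration, and any real speedup depends on your machine and on how expensive each task is.

```python
import multiprocessing


def count_primes(limit):
    """Count primes below `limit` by trial division (deliberately CPU-heavy).

    Hypothetical workload, chosen only because it keeps a core busy.
    """
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def parallel_prime_counts(limits, workers=2):
    """Run count_primes for each limit in a pool of worker processes."""
    # Each worker is a separate process with its own interpreter and GIL,
    # so the tasks can run on different cores at the same time.
    with multiprocessing.Pool(processes=workers) as pool:
        return pool.map(count_primes, limits)


if __name__ == "__main__":
    # The guard matters: platforms that start workers by re-importing this
    # module would otherwise try to create the pool again in every worker.
    print(parallel_prime_counts([50_000, 60_000, 70_000, 80_000], workers=4))
```

Note that the tasks share nothing: each one gets an argument and returns a result, which is exactly the "very little shared state" situation where multiprocessing is comfortable.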

In almost all other circumstances, I would use threads. (This includes making GUI applications responsive.)

For code clarity, one of the biggest things is to get to know and love the Queue object for talking between threads (or processes, if using multiprocessing; multiprocessing has its own Queue object). Queues make things a lot easier and, I think, enable much cleaner code.
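As a rough illustration of the Queue pattern, here is a small worker-pool sketch using threads. The names (worker, square_all, _STOP) are invented for this example. In Python 3 the module is called queue; in Python 2 it was Queue.

```python
import queue
import threading

_STOP = object()  # hypothetical sentinel telling a worker to exit


def worker(tasks, results):
    """Pull numbers off `tasks` until the sentinel arrives; push squares to `results`."""
    while True:
        item = tasks.get()
        if item is _STOP:
            break
        results.put(item * item)


def square_all(numbers, num_workers=3):
    """Square every number using a few worker threads that talk only via queues."""
    tasks = queue.Queue()
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(tasks, results))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    for n in numbers:
        tasks.put(n)
    for _ in threads:
        tasks.put(_STOP)  # one sentinel per worker
    for t in threads:
        t.join()
    # Every task produced exactly one result, so this never blocks forever.
    collected = [results.get() for _ in numbers]
    return sorted(collected)  # completion order is not deterministic


if __name__ == "__main__":
    print(square_all([1, 2, 3, 4, 5]))
```

The threads never touch shared variables directly, so no explicit locks are needed; the queues do the synchronization. Much the same structure should carry over to processes with multiprocessing.Queue and multiprocessing.Process.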

I had a look for some decent Queue examples, and this one has some great examples of how to use them and how useful they are (with the exact same logic applying for the multiprocessing Queue): http://effbot.org/librarybook/queue.htm

For efficiency, the details and outcome may not noticeably affect most people, but for Python <= 3.1 the CPython implementation has some interesting (and potentially brutal) efficiency issues on multicore machines that you may want to know about. These issues involve the GIL. David Beazley did a video presentation on it a while back and it is definitely worth watching. There is more information available, including a follow-up discussing significant improvements on this front in Python 3.2.

Basically, my cheap summary of the GIL-related multicore issue is that if you are expecting to get full multi-processor use out of CPython <= 2.7 by using multiple threads, don't be surprised if performance is not great, or even worse than on a single core. But if your threads are doing a bunch of I/O (file read/write, DB access, socket read/write, etc.), you may not even notice the problem.
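A quick way to see the I/O-bound case for yourself is a sketch like the one below. It fakes blocking I/O with time.sleep, which releases the GIL much as a real blocking read would. The helper names (fake_io, run_threaded, timed) are assumptions for this example, not part of any library.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io(delay):
    """Stand-in for a blocking call (socket read, DB query, ...)."""
    time.sleep(delay)  # the GIL is released while the thread sleeps
    return delay


def run_threaded(delays):
    """Run all the fake I/O calls concurrently, one thread per call."""
    with ThreadPoolExecutor(max_workers=max(1, len(delays))) as pool:
        return list(pool.map(fake_io, delays))


def timed(fn, *args):
    """Return (result, elapsed_seconds) for a single call of fn."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = timed(run_threaded, [0.2] * 4)
    # Four 0.2 s waits overlap, so this should take far less than 0.8 s.
    print(results, round(elapsed, 2))
```

If you replace fake_io with a pure-Python number-crunching loop, the threads should stop overlapping, which is the GIL effect described above.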

The multiprocessing module avoids this potential GIL problem entirely by creating a separate Python interpreter (with its own GIL) per process.
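One consequence of having an interpreter per process is that module-level state is not shared. The sketch below, with made-up names (counter, bump_and_report, run_child), has a child process increment a global and report it back through a multiprocessing.Queue, while the parent's copy stays untouched.

```python
import multiprocessing as mp

counter = 0  # module-level state; every process has its own copy


def bump_and_report(out_queue):
    """Runs in the child: change the global and send the new value back."""
    global counter
    counter += 1
    out_queue.put(counter)


def run_child():
    """Start one child process and return (value seen by child, value in parent)."""
    out_queue = mp.Queue()
    child = mp.Process(target=bump_and_report, args=(out_queue,))
    child.start()
    child_value = out_queue.get()  # read before join so the child can flush
    child.join()
    return child_value, counter


if __name__ == "__main__":
    child_value, parent_value = run_child()
    print("child saw:", child_value)
    print("parent still has:", parent_value)
```

With threads the same increment would be visible to everyone, which is convenient but is also where locking headaches come from. With processes, anything you want to share has to be sent explicitly, for example through a Queue.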


 