Multithreading for IO-bound tasks and multiprocessing for CPU-bound tasks

https://realpython.com/async-io-python/ gives an introduction to multithreading and multiprocessing, but it does not make clear what is valid in general and what is valid only in the Python environment. For instance, it says:

concurrency encompasses both multiprocessing (ideal for CPU-bound tasks) and threading (suited for IO-bound tasks)

I have developed concurrent apps in other programming languages such as C/C++ before, and this statement seems odd to me. Why would multithreading not be suited for CPU-bound tasks, and multiprocessing for IO-bound tasks, in general? AFAIK both can be used effectively for both kinds of task. Choosing between them depends on other criteria, such as task granularity, the amount of shared state, execution-order dependencies between tasks, and the process/thread creation cost (higher for processes, especially on some OSes). Is the statement above specific to the Python environment and the limitations of its global interpreter lock?

As Rob Pike, co-inventor of the Go language, said:

  • Concurrency is about dealing with lots of things at once.
  • Parallelism is about doing lots of things at once.
  • Not the same but related.
  • One is about structure, one is about execution.
  • Concurrency provides a way to structure a solution to solve a problem that may (but not necessarily) be parallelizable.

From Luciano Ramalho's book "Fluent Python", Chapter 18, page 557.

What they are trying to say is that multiprocessing (which is also the name of the Python library used for parallelism) is the way to go when you really do have multiple CPU-bound tasks that should run in parallel. In Python, this is achieved by sidestepping the GIL, for example with the Python multiprocessing module, which runs the work in separate processes, each with its own interpreter and its own GIL.

In Python (more precisely, in CPython) there is something called the GIL, the global interpreter lock, which allows only one thread to execute Python bytecode at a time. You need to get around the GIL to use parallelism. Meanwhile, you can still achieve concurrency despite the GIL limitation, because a thread that is blocked waiting on I/O releases the GIL and lets another thread run. Only one thread will run Python code at a time, though!
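To make the point about concurrency under the GIL concrete, here is a minimal sketch. It assumes that `time.sleep` is an acceptable stand-in for a blocking I/O call (it releases the GIL while waiting, as real blocking I/O does); the names `fake_io` and `run_threaded` are made up for illustration.

```python
import threading
import time


def fake_io(task_id, delay, results):
    """Hypothetical stand-in for a blocking I/O call.

    time.sleep releases the GIL while it waits, so other threads can run.
    """
    time.sleep(delay)
    results[task_id] = f"task {task_id} done"


def run_threaded(n_tasks=4, delay=0.2):
    """Run n_tasks fake I/O calls in threads; return (results, elapsed seconds)."""
    results = {}
    threads = [
        threading.Thread(target=fake_io, args=(i, delay, results))
        for i in range(n_tasks)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = run_threaded()
    # The waits overlap, so the total should be close to one delay,
    # not n_tasks * delay.
    print(sorted(results.values()))
    print(f"elapsed: {elapsed:.2f}s")
```

If `fake_io` did pure Python computation instead of sleeping, the threads would take turns holding the GIL and the total time would not be expected to drop.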

In Python you can achieve concurrency with:

  • Threads,
  • Futures (thread based when run on a thread pool),
  • and Async I/O (not thread based, but built on event loops and cooperative multitasking).
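As a rough sketch of the second and third items, the snippet below runs the same fake workload once with futures on a thread pool and once with Async I/O. The functions `blocking_fetch`, `async_fetch`, `with_futures` and `with_asyncio` are hypothetical names, and the sleeps are assumed stand-ins for real network or disk waits.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_fetch(n):
    """Hypothetical blocking call; the sleep stands in for waiting on I/O."""
    time.sleep(0.1)
    return n * 2


async def async_fetch(n):
    """Hypothetical coroutine; awaiting the sleep yields to the event loop."""
    await asyncio.sleep(0.1)
    return n * 2


def with_futures(items):
    """Futures on a thread pool: each blocking call runs in a worker thread."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(blocking_fetch, items))


async def with_asyncio(items):
    """Async I/O: one thread, coroutines cooperatively share the event loop."""
    return await asyncio.gather(*(async_fetch(n) for n in items))


if __name__ == "__main__":
    print(with_futures([1, 2, 3]))
    print(asyncio.run(with_asyncio([1, 2, 3])))
```

Both versions overlap the waiting time. Neither makes CPU-bound Python code run in parallel.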

So, as you can see, you have three ways of achieving concurrency, but because of the GIL limitation none of them gives you parallelism for CPU-bound tasks.

I found an article that could help you with concurrency and Async I/O:

Concurrency in Python with Async I/O

To achieve parallelism in Python you need to get around the GIL. A Python module that helps with that is called "multiprocessing".
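A minimal sketch of that approach, assuming a toy CPU-bound function: `count_primes` and `count_primes_parallel` are names invented for this example, and `multiprocessing.Pool` is just one of several ways to use the module.

```python
from multiprocessing import Pool


def count_primes(limit):
    """Toy CPU-bound work: count primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def count_primes_parallel(limits, workers=2):
    """Spread the calls over worker processes, each with its own interpreter and GIL."""
    with Pool(processes=workers) as pool:
        return pool.map(count_primes, limits)


if __name__ == "__main__":
    # The guard matters: on platforms that start workers by re-importing
    # this module, unguarded pool creation would run again in each child.
    print(count_primes_parallel([10, 100, 1000]))
```

Arguments and results are pickled and sent between processes, so this tends to pay off only when each task does enough computation to outweigh that overhead and the process start-up cost.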

Regarding your doubt:

...(which says that multiprocessing is ideal for CPU-bound tasks and multithreading is ideal for IO-bound tasks) only applies to the Python environment..

I cannot say whether it applies only to Python, as I do not know all the other languages. But, for example, JavaScript is well known for its async I/O approach, while C#, C++ and Java achieve both concurrency and parallelism with threads, without this limitation. C#, like JavaScript, also implemented Async I/O a long time ago.

Both David Beazley and Łukasz Langa mentioned that fact in the talks below:

  • David Beazley, Keynote at PyCon Brazil 2015
  • David Beazley, Curious Course on Coroutines and Concurrency
  • Łukasz Langa, Thinking In Coroutines - PyCon 2016

The links are in the presentation below as well.
