
Python multiprocessing map and apply don't run in parallel?

I am confused about the Python multiprocessing module. Suppose we write code like this:

from multiprocessing import Pool

pool = Pool()
for i in range(len(tasks)):
    # apply() blocks until this call's result is ready
    pool.apply(task_function, (tasks[i],))

First, i = 0: a subprocess is created and executes the first task. Since we are using apply instead of apply_async, the main process is blocked, so there is no chance for i to be incremented and the second task to start. So by doing it this way, we are actually writing serial code, not running in multiprocessing? And is the same true when we use map instead of map_async? No wonder the results of these tasks come back in order. If this is true, we shouldn't even bother to use multiprocessing's map and apply functions. Correct me if I am wrong.
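For instance, a quick timing sketch seems to confirm this (task_function here is just a hypothetical one-second sleep standing in for real work):

import time
from multiprocessing import Pool

def task_function(task):
    time.sleep(1)  # stand-in for real work
    return task

if __name__ == "__main__":
    tasks = [1, 2, 3, 4]
    start = time.time()
    with Pool(4) as pool:
        for t in tasks:
            pool.apply(task_function, (t,))  # blocks on every call
    # prints roughly 4 seconds: one task at a time, despite 4 workers
    print(f"elapsed: {time.time() - start:.1f}s")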

According to the documentation:

apply(func[, args[, kwds]])

Equivalent of the apply() built-in function. It blocks until the result is ready, so apply_async() is better suited for performing work in parallel. Additionally, func is only executed in one of the workers of the pool.

So yes, if you want to delegate work to another process and return control to your main process, you have to use apply_async.
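As a minimal sketch of that pattern (the body of task_function is a placeholder, not from your code), apply_async returns an AsyncResult immediately, so the loop submits every task before any of them finishes:

from multiprocessing import Pool

def task_function(task):
    return task * task  # placeholder work

if __name__ == "__main__":
    tasks = [1, 2, 3, 4]
    with Pool() as pool:
        # submit everything first; each call returns an AsyncResult
        # without blocking the main process
        async_results = [pool.apply_async(task_function, (t,)) for t in tasks]
        # get() blocks, but only here, after all tasks are in flight
        print([r.get() for r in async_results])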

Regarding your statement:

If this is true, we shouldn't even bother to use multiprocessing's map and apply functions

It depends on what you want to do. For example, map will split the arguments into chunks and apply the function to each chunk in the different processes of the pool, so you do achieve parallelism. This would work for your example:

pool.map(task_function, tasks)

It will split tasks into pieces, and then call task_function on each process from the pool with the different pieces of tasks. So, for example, you could have Process1 running task_function(task1) and Process2 running task_function(task2) at the same time.
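Here is a hedged sketch of that (again with a placeholder task_function, using multiprocessing.current_process only to make the worker assignment visible):

from multiprocessing import Pool, current_process

def task_function(task):
    # print which worker picked up which task, to make the
    # parallelism observable
    print(current_process().name, "->", task)
    return task * task  # placeholder work

if __name__ == "__main__":
    tasks = [1, 2, 3, 4, 5, 6, 7, 8]
    with Pool(4) as pool:
        # map blocks until all results are in, but the chunks are
        # processed concurrently by the 4 workers
        print(pool.map(task_function, tasks))

Note that map still blocks the main process until every result is ready; the parallelism happens among the workers, not between the pool and your main process. If you need the main process to stay free as well, map_async is the non-blocking counterpart.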
