
How to make sure each process uses roughly the same amount of time when using the multiprocessing module in Python?

Currently I am working on an asynchronous gradient algorithm with the Python multiprocessing module. The main idea is that I run multiple processes that update an array of global parameters asynchronously. I have finished most of the framework, but I have a problem: some processes sometimes seem to "get stuck" while others are still running, which makes the algorithm less effective. So I am wondering if there are good ways to make sure that the processes use roughly the same amount of time?
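For context, a minimal sketch of the kind of setup described above might look like the following; the `worker` function, the placeholder gradient, and the step count are illustrative assumptions, not the actual code:

```python
import multiprocessing as mp

def worker(params, n_steps):
    # Each worker repeatedly computes a placeholder "gradient" and
    # applies it to the shared parameter array.
    for _ in range(n_steps):
        grad = [0.01] * len(params)      # stand-in for a real gradient
        with params.get_lock():          # guard the shared update
            for i in range(len(params)):
                params[i] -= grad[i]

if __name__ == "__main__":
    params = mp.Array("d", [1.0] * 10)   # shared global parameters
    procs = [mp.Process(target=worker, args=(params, 1000))
             for _ in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print(list(params))
```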

Thanks!

This depends almost entirely on the problem you are trying to tackle. If you distribute a large task across several workers and one worker unpredictably gets a much larger chunk than the others, you will run into exactly this situation.

There are several options to avoid it:

  1. Try to estimate the effort for each chunk more precisely. Depending on your task, this might be possible. The chunks with the largest predicted effort should be split further.
  2. A very common way to approach this is to split the task into lots of very small chunks, many more than there are workers. Then feed all chunks into a queue and let your workers take their chunks from the queue. This way, when a worker receives an easy chunk, it finishes quickly and immediately takes the next chunk from the queue, so it does not end up idle while other workers seem to be "stuck" with a harder chunk (see the sketch after this list).
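Here is a minimal sketch of the queue approach from option 2; the chunk size, the stand-in workload, and the sentinel scheme are assumptions for illustration:

```python
import multiprocessing as mp

def worker(task_queue, result_queue):
    # Keep pulling chunks until a sentinel (None) tells us to stop, so a
    # worker that finishes an easy chunk immediately grabs the next one.
    for chunk in iter(task_queue.get, None):
        result_queue.put(sum(x * x for x in chunk))  # stand-in workload

if __name__ == "__main__":
    n_workers = 4
    data = list(range(10_000))
    chunk_size = 50  # many more chunks than workers
    chunks = [data[i:i + chunk_size]
              for i in range(0, len(data), chunk_size)]

    tasks, results = mp.Queue(), mp.Queue()
    for chunk in chunks:
        tasks.put(chunk)
    for _ in range(n_workers):
        tasks.put(None)  # one stop sentinel per worker

    procs = [mp.Process(target=worker, args=(tasks, results))
             for _ in range(n_workers)]
    for p in procs:
        p.start()
    # Drain the result queue before joining, so no worker blocks
    # while flushing its output.
    total = sum(results.get() for _ in range(len(chunks)))
    for p in procs:
        p.join()
    print(total)
```

If you are using `multiprocessing.Pool`, a similar effect can be had by passing a small `chunksize` to `imap_unordered`, which hands out work in small batches instead of pre-splitting it evenly across workers.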

A real deadlock, of course, will not be fixed by any of these approaches.

