
Python multiprocessing pool: maxtasksperchild

I have been dabbling with Python's multiprocessing library, and although it provides an incredibly easy-to-use API, its documentation is not always very clear. In particular, I find the argument 'maxtasksperchild' passed to an instance of the Pool class very confusing.

The following comes directly from Python's documentation (3.7.2):

maxtasksperchild is the number of tasks a worker process can complete before it will exit and be replaced with a fresh worker process, to enable unused resources to be freed. The default maxtasksperchild is None, which means worker processes will live as long as the pool.

The above raises more questions for me than it answers. Is it bad for a worker process to live as long as the pool? What makes a worker process 'fresh', and when is that desired? In general, when should you set the value for maxtasksperchild explicitly instead of letting it default to None, and what are considered best practices for maximizing processing speed?

From @Darkonaut's amazing answer on chunksize I now understand what chunksize does and represents. Since supplying a value for chunksize impacts the number of 'tasks', I was wondering whether there is anything to consider about how chunksize and maxtasksperchild depend on each other in order to ensure maximum performance.

Thanks!

Normally you don't need to touch this. Sometimes problems can arise, for example when code that calls outside of Python leaks memory. Limiting the number of tasks a worker process handles before it gets replaced then helps, because the "unused resources" it erroneously accumulates are released when the process is scrapped. Starting a new, "fresh" process keeps the problem contained. Because replacing a process takes time, you leave maxtasksperchild at its default for performance. If you run into unexplainable resource problems some day, you can try setting maxtasksperchild=1 to see whether that changes anything. If it does, something is likely leaking resources.
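To see the replacement happen, here is a minimal sketch that records which worker PID handled each task. The names report_pid and distinct_worker_pids are made up for this illustration, and the pool is limited to a single worker only so the result is easy to read.

```python
import os
from multiprocessing import Pool


def report_pid(_):
    """Return the PID of the worker process that ran this task."""
    return os.getpid()


def distinct_worker_pids(maxtasksperchild, n_tasks=6):
    """Run n_tasks trivial tasks on a one-worker pool and collect the PIDs used."""
    # chunksize=1 sends each input item as its own task, so the number of
    # tasks counted against maxtasksperchild equals n_tasks here.
    with Pool(processes=1, maxtasksperchild=maxtasksperchild) as pool:
        return set(pool.map(report_pid, range(n_tasks), chunksize=1))


if __name__ == "__main__":
    # Default: the single worker lives as long as the pool.
    print("maxtasksperchild=None:", len(distinct_worker_pids(None)), "PID(s)")
    # The worker is replaced after every task, so several PIDs show up.
    print("maxtasksperchild=1:", len(distinct_worker_pids(1)), "PID(s)")
```

With the default you should see one PID. With maxtasksperchild=1 you should see several, and the run takes noticeably longer, which is the replacement cost mentioned above.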


 