
How to have a dedicated variable for each multiprocessing worker, which keeps its value between calls?

I have the following code:

pool = Pool(cpu_count())
pool.imap(process_item, items, chunksize=100)

In the process_item() function I use structures that are expensive to create but reusable (though not concurrently shareable). Currently, each call to process_item() creates the resource again in a local variable. Creating it once per worker and then reusing it would be a great performance benefit.

Question

How can I have cpu_count() dedicated instances of that resource, and how should process_item() be implemented so that it accesses the instance belonging to its particular worker?

If you cannot use anything outside the standard library, I would suggest using an initializer when creating the pool:

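from multiprocessing import Pool
import os
import random

class A:
    """Stand-in for a resource that is expensive to create."""

    def __init__(self):
        self.var = random.randint(0, 1000)

    def get(self):
        # Print the value together with the PID of the worker that owns it.
        print(self.var, os.getpid())


def worker(some_arg):
    # Reading a module-level global needs no `global` statement.
    expensive_var.get()

def initializer(*args):
    # Runs once in each worker process; the global exists in that process only.
    global expensive_var
    expensive_var = A()


if __name__ == "__main__":
    pool = Pool(8, initializer=initializer, initargs=())
    # Consume the iterator so that every item is actually processed.
    for result in pool.imap(worker, range(100)):
        continue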

Create your variables inside the initializer and declare them global. You can then use them inside the function you pass to the pool. This works because the initializer is executed once when each process of the pool starts. Declaring a variable global there makes it a global only within that child process, so it is accessible while the function you passed to the pool executes.
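To tie this back to the question's process_item() and imap() call, here is a minimal sketch of the same pattern with a constructor argument passed through initargs. The names ExpensiveResource, init_worker, resource, run and the scale value 10 are hypothetical placeholders, not part of the original code. Each result also carries the PID that built the resource and the PID that handled the item, so you can check that an instance is only ever used by the worker that created it.

```python
import os
from multiprocessing import Pool


class ExpensiveResource:
    """Hypothetical stand-in for something costly to construct."""

    def __init__(self, scale):
        self.scale = scale
        self.owner_pid = os.getpid()  # process in which this instance was built

    def compute(self, x):
        return x * self.scale


resource = None  # per-process global, filled in by init_worker


def init_worker(scale):
    # Called once in each worker process when the pool starts that process.
    global resource
    resource = ExpensiveResource(scale)


def process_item(item):
    # Reuses the instance owned by whichever worker runs this call.
    return resource.compute(item), resource.owner_pid, os.getpid()


def run(items, workers=4):
    # initargs is forwarded to init_worker in every worker process.
    with Pool(workers, initializer=init_worker, initargs=(10,)) as pool:
        return list(pool.imap(process_item, items, chunksize=10))


if __name__ == "__main__":
    results = run(range(100))
    # Every item was handled by the same process that built its resource.
    print(all(built_in == ran_in for _, built_in, ran_in in results))
    # Number of distinct instances used: at most the number of workers.
    print(len({built_in for _, built_in, _ in results}))
```

The number of distinct builder PIDs never exceeds the number of workers, no matter how many items are processed, which is the "create once per worker, then reuse" behaviour asked for.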

There was a Stack Overflow answer that explained all this better, but I can't seem to find it for now. But this is basically the gist of it.


