can't multiprocess user defined code - cannot pickle

I tried using multiprocessing on my task, which generally means to do some calculation and then pass back the result. 我尝试对任务使用多处理,这通常意味着先进行一些计算然后将结果传回。 The problem is that the code defining the calculation is defined by user, it is compiled from string before the execution. 问题是定义计算的代码是由用户定义的,它是在执行之前从字符串编译的。 This works perfect using exec() , eval() or compile() etc. when being run in the main process. 在主进程中运行时,这可以使用exec()eval()compile()等实现完美工作。 The example below works only for f1 function but not for f2 . 以下示例仅适用于f1函数,不适用于f2 I get 'Can't pickle class 'code'`. 我得到“不能腌制类”代码”。 Is there any way round this? 有什么办法解决吗? For example using multiprocessing differently? 例如以不同方式使用多重处理? Or using other package? 还是使用其他包装? Or some more low level stuff? 还是一些低级的东西? Unfortunatelly passing the string to the process and then compiling inside the process is not an option for me because of the design of the whole application (ie the code string is 'lost' and only the compiled version is available). 不幸的是,由于整个应用程序的设计,将字符串传递给进程然后在进程内部进行编译对我来说不是一个选择(即代码字符串“丢失”,只有编译后的版本可用)。

import multiprocessing

def callf(f, a):
    exec(f, {'a': a})

if __name__ == "__main__":
    f = compile("print(a)", filename="<string>", mode="exec")
    callf(f, 10)  # this works
    process = multiprocessing.Process(target=callf, args=(f, 20))  # this does not work
    process.start()
    process.join()

UPDATE: here is another attempt, which is actually closer to my actual need. It results in a different error message, but again the function cannot be pickled.

import multiprocessing

if __name__ == "__main__":
    source = "def f(): print('done')"
    locals = dict()
    exec(source, {}, locals)
    f = locals['f']
    f()  # this works
    process = multiprocessing.Process(target=f)  # this does not work
    process.start()
    process.join()

pickle can't serialize code objects but dill can. There is a dill-based fork of multiprocessing called multiprocessing_on_dill, but I have no idea how good it is. You could also just dill-encode the code object to make standard multiprocessing happy.

import multiprocessing
import dill

def callf_dilled(f_dilled, a):
    return callf(dill.loads(f_dilled), a)

def callf(f, a):
    exec(f, {'a': a})

if __name__ == "__main__":
    f = compile("print(a)", filename="<string>", mode="exec")
    callf(f, 10)  # this works
    process = multiprocessing.Process(target=callf_dilled, 
        args=(dill.dumps(f), 20))  # now this works too!
    process.start()
    process.join()
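
The same dill round-trip should also handle the function case from the update, since dill serializes dynamically defined functions by value. A minimal sketch, assuming dill is installed; the run_dilled helper is my own name, not part of the question:

import multiprocessing
import dill

def run_dilled(payload):
    # Rebuild the function in the child process and call it.
    dill.loads(payload)()

if __name__ == "__main__":
    source = "def f(): print('done')"
    namespace = dict()
    exec(source, {}, namespace)
    f = namespace['f']
    process = multiprocessing.Process(target=run_dilled,
        args=(dill.dumps(f),))
    process.start()
    process.join()

And a sketch of the multiprocessing_on_dill route. I have not tried it, but since the fork is advertised as a drop-in replacement for multiprocessing, swapping the import should be all that is needed:

# Untested: assumes multiprocessing_on_dill mirrors the standard
# multiprocessing API, so the originally failing example runs unchanged.
import multiprocessing_on_dill as multiprocessing

def callf(f, a):
    exec(f, {'a': a})

if __name__ == "__main__":
    f = compile("print(a)", filename="<string>", mode="exec")
    process = multiprocessing.Process(target=callf, args=(f, 30))
    process.start()
    process.join()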
