How to pickle functions/classes defined in __main__ (python)
I would like to be able to pickle a function or class from within __main__, with the obvious problem (mentioned in other posts) that the pickled function/class is in the __main__ namespace, so unpickling in another script/module will fail.
I have the following solution, which works. Is there a reason this should not be done?
The following is in myscript.py:
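import myscript  # re-imports this same file under the name "myscript"
import pickle

if __name__ == "__main__":
    # The instance is pickled as myscript.myclass, not __main__.myclass.
    print(pickle.dumps(myscript.myclass()))
else:
    # Only defined when the file is imported as a module.
    class myclass:
        pass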
Edit: The unpickling would be done in a script/module that has access to myscript.py and can do an import myscript. The aim is to use a solution like parallel python to call functions remotely, and to be able to write a short, standalone script that contains the functions/classes that can be accessed remotely.
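As a rough check of the approach in the question, the sketch below writes a Python 3 adaptation of myscript.py to a temporary directory, runs it as a script, and unpickles the result from the parent process. The temporary directory, the use of sys.stdout.buffer, and the variable names are assumptions made for illustration; they are not part of the original post.

```python
import os
import pickle
import subprocess
import sys
import tempfile
import textwrap

# Python 3 rendering of the question's myscript.py (an adaptation, not the
# original text): raw pickle bytes go to stdout so the parent can read them.
SCRIPT = textwrap.dedent('''
    import pickle
    import sys
    import myscript

    if __name__ == "__main__":
        sys.stdout.buffer.write(pickle.dumps(myscript.myclass()))
    else:
        class myclass(object):
            pass
''')

with tempfile.TemporaryDirectory() as workdir:
    script_path = os.path.join(workdir, "myscript.py")
    with open(script_path, "w") as handle:
        handle.write(SCRIPT)

    # Run the file as a script; its own directory is put on sys.path,
    # which is what lets "import myscript" find the same file again.
    payload = subprocess.check_output([sys.executable, script_path])

    # A separate consumer needs myscript.py to be importable as well.
    sys.path.insert(0, workdir)
    try:
        restored = pickle.loads(payload)
    finally:
        sys.path.remove(workdir)

# The class is recorded under the module name, not under __main__.
print(type(restored).__module__)
```

The point of the sketch is that the pickle refers to myscript.myclass, so any consumer that can import myscript can load it. One side effect worth noting is that the file is executed twice: once as __main__ and once as the module myscript.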
You can get a better handle on global objects by importing __main__ and using the methods available in that module. This is what dill does in order to serialize almost anything in python. Basically, when dill serializes an interactively defined function, it uses some name mangling on __main__ on both the serialization and deserialization side that makes __main__ a valid module.
>>> import dill
>>>
>>> def bar(x):
...     return foo(x) + x
...
>>> def foo(x):
...     return x**2
...
>>> bar(3)
12
>>>
>>> _bar = dill.loads(dill.dumps(bar))
>>> _bar(3)
12
Actually, dill registers its types into the pickle registry, so if you have some black box code that uses pickle and you can't really edit it, then just importing dill can magically make it work without monkeypatching the 3rd party code.
Or, if you want the whole interpreter session sent over as a "python image", dill can do that too.
>>> # continuing from above
>>> dill.dump_session('foobar.pkl')
>>>
>>> ^D
dude@sakurai>$ python
Python 2.7.5 (default, Sep 30 2013, 20:15:49)
[GCC 4.2.1 (Apple Inc. build 5566)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import dill
>>> dill.load_session('foobar.pkl')
>>> _bar(3)
12
You can easily send the image across ssh to another computer and start where you left off there, as long as there's version compatibility of pickle and the usual caveats about python changing and things being installed.
I actually use dill to serialize objects and send them across parallel resources with parallel python, multiprocessing, and mpi4py. I roll these up conveniently into the pathos package (and pyina for MPI), which provides a uniform map interface for different parallel batch processing backends.
>>> # continued from above
>>> from pathos.multiprocessing import ProcessingPool as Pool
>>> Pool(4).map(foo, range(10))
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
>>>
>>> from pyina.launchers import MpiPool
>>> MpiPool(4).map(foo, range(10))
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
There are also non-blocking and iterative maps as well as non-parallel pipe connections. I also have a pathos module for pp; however, it is somewhat unstable for functions defined in __main__. I'm working on improving that. If you like, fork the code on github and help make pp better for functions defined in __main__. The reason pp doesn't pickle well is that pp does its serialization tricks through using temporary file objects and reading the interpreter session's history... so it doesn't serialize objects in the same way that multiprocessing or mpi4py do. I have a dill module, dill.source, that seamlessly does the same type of pickling that pp uses, but it's rather new.
If you are trying to pickle something so that you can use it somewhere else, separate from test_script, that's not going to work, because pickle (apparently) just tries to load the function from the module. Here's an example:
test_script.py
def my_awesome_function(x, y, z):
    return x + y + z
picklescript.py
import pickle
import test_script

with open("awesome.pickle", "wb") as f:
    pickle.dump(test_script.my_awesome_function, f)
If you run python picklescript.py, then change the filename of test_script, loading the function will fail. For example, running this:
import pickle

with open("awesome.pickle", "rb") as f:
    pickle.load(f)
will give you the following traceback:
Traceback (most recent call last):
  File "load_pickle.py", line 3, in <module>
    pickle.load(f)
  File "/Library/Frameworks/Python.framework/Versions/7.3/lib/python2.7/pickle.py", line 1378, in load
    return Unpickler(file).load()
  File "/Library/Frameworks/Python.framework/Versions/7.3/lib/python2.7/pickle.py", line 858, in load
    dispatch[key](self)
  File "/Library/Frameworks/Python.framework/Versions/7.3/lib/python2.7/pickle.py", line 1090, in load_global
    klass = self.find_class(module, name)
  File "/Library/Frameworks/Python.framework/Versions/7.3/lib/python2.7/pickle.py", line 1124, in find_class
    __import__(module)
ImportError: No module named test_script
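The failure above can be reproduced without touching any files. The following is only a sketch in Python 3: it registers a throwaway module object under the hypothetical name test_script_demo (a stand-in for test_script, not something from the original answer), pickles a function that claims to live there, and then removes the module before loading.

```python
import pickle
import sys
import types

# Hypothetical stand-in for test_script.py, built in memory.
demo_module = types.ModuleType("test_script_demo")


def my_awesome_function(x, y, z):
    return x + y + z


# Make the function look as if it had been defined in test_script_demo.
my_awesome_function.__module__ = "test_script_demo"
my_awesome_function.__qualname__ = "my_awesome_function"
demo_module.my_awesome_function = my_awesome_function
sys.modules["test_script_demo"] = demo_module

payload = pickle.dumps(my_awesome_function)

# The pickle holds the module name and the function name, not the code.
stores_only_names = (
    b"test_script_demo" in payload and b"my_awesome_function" in payload
)

# Simulate renaming or removing the module before loading.
del sys.modules["test_script_demo"]
try:
    pickle.loads(payload)
    load_error = None
except ImportError as exc:
    load_error = exc

print(stores_only_names, type(load_error).__name__)
```

Because only the names are stored, the loader has to import the module again, which is why renaming test_script breaks loading.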
Pickle seems to look at the main scope for definitions of classes and functions. From inside the module you're unpickling from, try this:
import myscript
import __main__
__main__.myclass = myscript.myclass
#unpickle anywhere after this