Error pickling a `matlab` object in joblib `Parallel` context
I'm running some Matlab code in parallel from inside a Python context (I know, but that's what's going on), and I'm hitting an import error involving `matlab.double`. The same code works fine in a `multiprocessing.Pool`, so I'm having trouble figuring out what the problem is. Here's a minimal reproducing test case.
import matlab
from multiprocessing import Pool
from joblib import Parallel, delayed

# A global object that I would like to be available in the parallel subroutine
x = matlab.double([[0.0]])

def f(i):
    print(i, x)

with Pool(4) as p:
    p.map(f, range(10))
# This prints 0, [[0.0]]\n1, [[0.0]]\n... as expected

for _ in Parallel(4, backend='multiprocessing')(delayed(f)(i) for i in range(10)):
    pass
# This also prints 0, [[0.0]]\n1, [[0.0]]\n... as expected

# Now run with the default `backend='loky'`
for _ in Parallel(4)(delayed(f)(i) for i in range(10)):
    pass
# ^ this crashes.
So, the only problematic one is the one using the `'loky'` backend. The full traceback is:
exception calling callback for <Future at 0x7f63b5a57358 state=finished raised BrokenProcessPool>
joblib.externals.loky.process_executor._RemoteTraceback:
'''
Traceback (most recent call last):
  File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/joblib/externals/loky/process_executor.py", line 391, in _process_worker
    call_item = call_queue.get(block=True, timeout=timeout)
  File "~/miniconda3/envs/myenv/lib/python3.6/multiprocessing/queues.py", line 113, in get
    return _ForkingPickler.loads(res)
  File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/matlab/mlarray.py", line 31, in <module>
    from _internal.mlarray_sequence import _MLArrayMetaClass
  File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/matlab/_internal/mlarray_sequence.py", line 3, in <module>
    from _internal.mlarray_utils import _get_strides, _get_size, \
  File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/matlab/_internal/mlarray_utils.py", line 4, in <module>
    import matlab
  File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/matlab/__init__.py", line 24, in <module>
    from mlarray import double, single, uint8, int8, uint16, \
ImportError: cannot import name 'double'
'''

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/joblib/externals/loky/_base.py", line 625, in _invoke_callbacks
    callback(self)
  File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/joblib/parallel.py", line 309, in __call__
    self.parallel.dispatch_next()
  File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/joblib/parallel.py", line 731, in dispatch_next
    if not self.dispatch_one_batch(self._original_iterator):
  File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/joblib/parallel.py", line 759, in dispatch_one_batch
    self._dispatch(tasks)
  File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/joblib/parallel.py", line 716, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 510, in apply_async
    future = self._workers.submit(SafeFunction(func))
  File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/joblib/externals/loky/reusable_executor.py", line 151, in submit
    fn, *args, **kwargs)
  File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/joblib/externals/loky/process_executor.py", line 1022, in submit
    raise self._flags.broken
joblib.externals.loky.process_executor.BrokenProcessPool: A task has failed to un-serialize. Please ensure that the arguments of the function are all picklable.
joblib.externals.loky.process_executor._RemoteTraceback:
'''
Traceback (most recent call last):
  File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/joblib/externals/loky/process_executor.py", line 391, in _process_worker
    call_item = call_queue.get(block=True, timeout=timeout)
  File "~/miniconda3/envs/myenv/lib/python3.6/multiprocessing/queues.py", line 113, in get
    return _ForkingPickler.loads(res)
  File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/matlab/mlarray.py", line 31, in <module>
    from _internal.mlarray_sequence import _MLArrayMetaClass
  File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/matlab/_internal/mlarray_sequence.py", line 3, in <module>
    from _internal.mlarray_utils import _get_strides, _get_size, \
  File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/matlab/_internal/mlarray_utils.py", line 4, in <module>
    import matlab
  File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/matlab/__init__.py", line 24, in <module>
    from mlarray import double, single, uint8, int8, uint16, \
ImportError: cannot import name 'double'
'''

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "test.py", line 20, in <module>
    for _ in Parallel(4)(delayed(f)(i) for i in range(10)):
  File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/joblib/parallel.py", line 934, in __call__
    self.retrieve()
  File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/joblib/parallel.py", line 833, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 521, in wrap_future_result
    return future.result(timeout=timeout)
  File "~/miniconda3/envs/myenv/lib/python3.6/concurrent/futures/_base.py", line 432, in result
    return self.__get_result()
  File "~/miniconda3/envs/myenv/lib/python3.6/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
  File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/joblib/externals/loky/_base.py", line 625, in _invoke_callbacks
    callback(self)
  File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/joblib/parallel.py", line 309, in __call__
    self.parallel.dispatch_next()
  File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/joblib/parallel.py", line 731, in dispatch_next
    if not self.dispatch_one_batch(self._original_iterator):
  File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/joblib/parallel.py", line 759, in dispatch_one_batch
    self._dispatch(tasks)
  File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/joblib/parallel.py", line 716, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 510, in apply_async
    future = self._workers.submit(SafeFunction(func))
  File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/joblib/externals/loky/reusable_executor.py", line 151, in submit
    fn, *args, **kwargs)
  File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/joblib/externals/loky/process_executor.py", line 1022, in submit
    raise self._flags.broken
joblib.externals.loky.process_executor.BrokenProcessPool: A task has failed to un-serialize. Please ensure that the arguments of the function are all picklable.
Looking at the traceback, it seems like the root cause is an issue importing the `matlab` package in the child process.
It's probably worth noting that this all runs just fine if I had instead defined `x = np.array([[0.0]])` (after `import numpy as np`). And of course the main process has no problem with any `matlab` imports, so I'm not sure why the child process would.
I'm not sure if this error has anything in particular to do with the `matlab` package, or if it's something to do with global variables and `cloudpickle` or `loky`. In my application it would help to stick with `loky`, so I'd appreciate any insight!
I should also note that I'm using the official Matlab Engine for Python: https://www.mathworks.com/help/matlab/matlab-engine-for-python.html . I suppose that might make it hard for others to try out the test case, so I wish I could reproduce this error with a type other than `matlab.double`, but I haven't found one yet.
Digging around more, I've noticed that the process of importing the `matlab` package is more circular than I would expect, and I'm speculating that this could be part of the problem. The issue is that when `import matlab` is run by `loky`'s `_ForkingPickler`, first the file `matlab/mlarray.py` is imported, which imports some other files, one of which contains `import matlab`; this causes `matlab/__init__.py` to be run, which internally has `from mlarray import double, single, uint8, ...`, and that is the line that causes the crash.
Could this circularity be the issue? If so, why can I import this module in the main process but not in the `loky` backend?
The error is caused by the incorrect loading order of global objects in the child processes. It can be seen clearly in the traceback (`_ForkingPickler.loads(res) -> ... -> import matlab -> from mlarray import ...`) that `matlab` is not yet fully imported when the global variable `x` is loaded by `cloudpickle`.
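The reason unpickling triggers an import at all: a pickle stream stores only a module-qualified reference to each object's class, never the class's code, so the worker must import the defining module during deserialization. A minimal sketch with a stdlib type (standing in for `matlab.double`, which not everyone can install):

```python
import pickle
import pickletools

from decimal import Decimal

# Pickle an instance of a class defined outside this module.
data = pickle.dumps(Decimal("1.5"))

# The stream contains a GLOBAL/STACK_GLOBAL opcode naming
# ("decimal", "Decimal") -- unpickling must import `decimal` to resolve it.
opcodes = [op.name for op, arg, pos in pickletools.genops(data)]
print("STACK_GLOBAL" in opcodes or "GLOBAL" in opcodes)  # True
```

If that import misbehaves in the worker, as the circular `import matlab` does here, `_ForkingPickler.loads` fails exactly as shown in the traceback.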
`joblib` with `loky` seems to treat modules as normal global objects and sends them dynamically to the child processes. joblib doesn't record the order in which those objects/modules were defined, so they are loaded (initialized) in an arbitrary order in the child processes.
A simple workaround is to manually pickle the matlab object and load it after importing matlab inside your function.

import matlab
import pickle

px = pickle.dumps(matlab.double([[0.0]]))

def f(i):
    import matlab
    x = pickle.loads(px)
    print(i, x)
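Since the Matlab engine isn't available everywhere, the same pattern can be sketched with a stdlib type standing in for `matlab.double`: serialize the object to plain bytes up front (bytes travel to workers without needing any third-party import), then deserialize inside the worker only after its module has been imported explicitly.

```python
import pickle

from decimal import Decimal  # stand-in for matlab.double

# Eagerly serialize in the parent; `px` is just bytes, which any
# worker can receive without importing the object's module first.
px = pickle.dumps(Decimal("1.5"))

def f(i):
    # Import the defining module explicitly, *then* unpickle.
    import decimal  # stands in for `import matlab`
    x = pickle.loads(px)
    return (i, x)

print(f(0))  # (0, Decimal('1.5'))
```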
Of course, you can also use joblib's `dump` and `load` to serialize the objects.
Thanks to the suggestion of @Aaron, you can also use an `initializer` (for `loky`) to import Matlab before loading `x`.
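The initializer idea itself is standard: a callable that each worker process runs once at startup, before any task payload is unpickled. The standard library's `concurrent.futures.ProcessPoolExecutor` exposes it directly (Python 3.7+), which is a useful reference for what we want `loky` to do; a sketch:

```python
from concurrent.futures import ProcessPoolExecutor

def _init_worker():
    # Runs once per worker, before any task is deserialized.
    # In the Matlab case this is where `import matlab` would go.
    global _ready
    _ready = True

def square(i):
    return i * i

with ProcessPoolExecutor(max_workers=2, initializer=_init_worker) as ex:
    results = list(ex.map(square, range(5)))
print(results)  # [0, 1, 4, 9, 16]
```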
Currently there's no simple API to specify an `initializer` for `Parallel`, so I wrote a simple function:
def with_initializer(self, f_init):
    # Overwrite initializer hook in the Loky ProcessPoolExecutor
    # https://github.com/tomMoral/loky/blob/f4739e123acb711781e46581d5ed31ed8201c7a9/loky/process_executor.py#L850
    hasattr(self._backend, '_workers') or self.__enter__()
    origin_init = self._backend._workers._initializer
    def new_init():
        origin_init()
        f_init()
    self._backend._workers._initializer = new_init if callable(origin_init) else f_init
    return self
It is a little bit hacky, but it works well with the current versions of joblib and loky. Then you can use it like this:
import matlab
from joblib import Parallel, delayed

x = matlab.double([[0.0]])

def f(i):
    print(i, x)

def _init_matlab():
    import matlab

with Parallel(4) as p:
    for _ in with_initializer(p, _init_matlab)(delayed(f)(i) for i in range(10)):
        pass
I hope the developers of joblib will add an `initializer` argument to the constructor of `Parallel` in the future.