Script using multiprocessing with partial and map failing on Python > 3, working fine on 2.7, cannot pickle '_thread.lock'
Until today I used the following code on Python 2.7 to parallelize the creation of many PNG pictures with matplotlib. Today I tried to move everything to Python 3.8, and the part that I cannot adapt involves the parallelization done with multiprocessing.
The idea is that I have a script which needs to produce several images with similar settings from different timesteps of a data file. As the plotting routine can be parametrized, I execute it over chunks of 10 timesteps distributed among different tasks to speed up the process.

Here is the relevant part of the script, which I'm not going to paste in full given its length.
from multiprocessing import Pool
from functools import partial

def main():
    # arguments to be passed to the plotting functions;
    # they contain data and information about the plot
    args = dict(m=m, x=x, y=y, ax=ax,
                winds_10m=winds_10m, mslp=mslp, ....)
    # chunks of timesteps
    dates = chunks(time, 10)
    # partial version of the function plot_files(), see underneath
    plot_files_param = partial(plot_files, **args)
    p = Pool(8)
    p.map(plot_files_param, dates)

def plot_files(dates, **args):
    first = True
    for date in dates:
        # loop over dates, retrieve data from args, e.g. args['mslp'], and do the plotting

if __name__ == "__main__":
    import time
    start_time = time.time()
    main()
    elapsed_time = time.time() - start_time
    print_message("script took " + time.strftime("%H:%M:%S", time.gmtime(elapsed_time)))
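The `chunks()` helper used above is not shown in the snippet. A common implementation (an assumption on my part, not the author's exact code) splits a sequence into successive pieces of at most `n` elements:

```python
def chunks(seq, n):
    """Split seq into successive lists of at most n elements each."""
    return [seq[i:i + n] for i in range(0, len(seq), n)]

# 25 timesteps in chunks of 10 -> lists of 10, 10 and 5 elements
print(chunks(list(range(25)), 10))
```

Each chunk then becomes one item of the iterable handed to `Pool.map`, so each worker plots up to 10 timesteps per task.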
This used to work fine on Python 2.7, but now I get this error:
Traceback (most recent call last):
File "plot_winds10m.py", line 135, in <module>
main()
File "plot_winds10m.py", line 79, in main
p.map(plot_files_param, dates)
File "lib/python3.8/multiprocessing/pool.py", line 364, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "lib/python3.8/multiprocessing/pool.py", line 771, in get
raise self._value
File "lib/python3.8/multiprocessing/pool.py", line 537, in _handle_tasks
put(task)
File "lib/python3.8/multiprocessing/connection.py", line 206, in send
self._send_bytes(_ForkingPickler.dumps(obj))
File "lib/python3.8/multiprocessing/reduction.py", line 51, in dumps
cls(buf, protocol).dump(obj)
TypeError: cannot pickle '_thread.lock' object
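The error can be reproduced independently of `multiprocessing`: `Pool.map` pickles the task function and its arguments to ship them to the worker processes, and any object whose state holds a thread lock fails that serialization step with the same `TypeError`. A minimal demonstration:

```python
import pickle
import threading

# multiprocessing serializes task arguments with pickle before sending
# them to workers; a raw lock object cannot be serialized.
try:
    pickle.dumps(threading.Lock())
except TypeError as err:
    print(err)  # e.g. "cannot pickle '_thread.lock' object"
```

So the question becomes: which of the objects inside `args` carries a hidden lock?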
The only thing that changed, besides the Python and package versions, is the system: I'm testing this on macOS instead of Linux, but that should not make a big difference, especially since this is all running inside a conda environment.

Does anyone have an idea of how to fix this?

(Here is the link to the GitHub repo: https://github.com/guidocioni/icon_forecasts/blob/master/plotting/plot_winds10m.py )
I figured out the problem, in case anyone arrives here desperate for an answer.
The problem is that some of the conversions I was doing with metpy's unit_array produce a pint array which, for some reason, is not picklable. When I then passed this array in the args of the partial function, I got the error.
Doing the conversion with .convert_units() instead, or just extracting the array part from the data (either with .values or .magnitude), ensured that I was passing only a numpy array or a DataArray, and these objects are picklable.