
How to fix 'TypeError: can't pickle module objects' during multiprocessing?

I am trying to implement multiprocessing, but I am having difficulty accessing information from the scans objects that I pass through the pool.map() function.

Before multiprocessing (this works perfectly):

for sc in scans:
    my_file = scans[sc].resources['DICOM'].files[0]

After multiprocessing (does not work, error shown below):

def process(x):
    my_file = x.resources['DICOM'].files[0] 

def another_method():
    ...                
    pool = Pool(os.cpu_count())
    pool.map(process, [scans[sc] for sc in scans])

another_method()  

The error I get with the 'After multiprocessing' code:

---> 24         pool.map(process, [scans[sc] for sc in scans])

~/opt/anaconda3/lib/python3.7/multiprocessing/pool.py in map(self, func, iterable, chunksize)
    266         in a list that is returned.
    267         '''
--> 268         return self._map_async(func, iterable, mapstar, chunksize).get()
    269 
    270     def starmap(self, func, iterable, chunksize=None):

~/opt/anaconda3/lib/python3.7/multiprocessing/pool.py in get(self, timeout)
    655             return self._value
    656         else:
--> 657             raise self._value
    658 
    659     def _set(self, i, obj):

~/opt/anaconda3/lib/python3.7/multiprocessing/pool.py in _handle_tasks(taskqueue, put, outqueue, pool, cache)
    429                         break
    430                     try:
--> 431                         put(task)
    432                     except Exception as e:
    433                         job, idx = task[:2]

~/opt/anaconda3/lib/python3.7/multiprocessing/connection.py in send(self, obj)
    204         self._check_closed()
    205         self._check_writable()
--> 206         self._send_bytes(_ForkingPickler.dumps(obj))
    207 
    208     def recv_bytes(self, maxlength=None):

~/opt/anaconda3/lib/python3.7/multiprocessing/reduction.py in dumps(cls, obj, protocol)
     49     def dumps(cls, obj, protocol=None):
     50         buf = io.BytesIO()
---> 51         cls(buf, protocol).dump(obj)
     52         return buf.getbuffer()
     53 

TypeError: can't pickle module objects

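You didn't provide the full data structure, but this might help. Multiprocessing is sensitive to the objects it is given: every argument passed to pool.map() has to be pickled so it can be sent to a worker process, and some objects, such as file objects and modules, can't be pickled. See: Python: can't pickle module objects error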
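A minimal sketch of what is probably going on, assuming the scan objects keep a reference to a module (or something similar, such as an open file) in one of their attributes. The class Scan and the helper find_unpicklable are made up for illustration and are not part of any library.

```python
import os
import pickle


class Scan:
    """Hypothetical stand-in for one of the scan objects."""

    def __init__(self, name):
        self.name = name      # plain string: pickles fine
        self.backend = os     # reference to a module: cannot be pickled


def find_unpicklable(obj):
    """Return the names of the instance attributes that pickle refuses."""
    bad = []
    for attr, value in vars(obj).items():
        try:
            pickle.dumps(value)
        except Exception:  # TypeError, PicklingError, AttributeError, ...
            bad.append(attr)
    return bad


# Pickling a module directly fails with the same kind of TypeError
# that pool.map() reports (the exact wording depends on the Python version).
try:
    pickle.dumps(os)
except TypeError as exc:
    print("pickle failed:", exc)

scan = Scan("scan_001")
print(find_unpicklable(scan))  # → ['backend']
```

Running a helper like this on one real scan object should show which attribute is dragging the module along.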

If you need only the file name, pass that to the map function instead of the whole object.
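As a sketch of that suggestion: pull a plain value out of each scan in the parent process, then hand only those values to the pool. This assumes that files[0] is, or can be converted to, a simple path string, which the question does not confirm. The names collect_paths and run are placeholders, and the body of process is placeholder work.

```python
import os
from multiprocessing import Pool


def process(path):
    """Runs in a worker; receives only a string, which pickles trivially."""
    # Placeholder work: the real code would read the DICOM file here.
    return os.path.basename(path)


def collect_paths(scans):
    """Runs in the parent process: extract one plain string per scan."""
    return [str(scans[sc].resources['DICOM'].files[0]) for sc in scans]


def run(scans):
    """Map process() over the extracted paths instead of the scan objects."""
    paths = collect_paths(scans)
    with Pool(os.cpu_count()) as pool:
        return pool.map(process, paths)

# Usage (call from under an `if __name__ == "__main__":` guard):
#     results = run(scans)
```

The scan objects never leave the parent process, so whatever they hold internally no longer matters to pickle.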

I'm not an expert, but I got around this issue by changing the for loop slightly.

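import os
from multiprocessing import Pool

def process(i_x):
    # scans is not an argument here; it is looked up in the global namespace
    x = scans[i_x]
    my_file = x.resources['DICOM'].files[0]

def another_method():
    global scans  # without this, scans would be local and invisible to process()
    ...
    scans = ...
    pool = Pool(os.cpu_count())
    # assumes scans can be indexed by position; for a mapping, pass list(scans) instead
    pool.map(process, range(len(scans)))

another_method()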

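By doing so, the object scans is no longer passed to process as an argument. It is not found in the local namespace of the function process, so it is looked up in the global namespace instead. For that to work, scans has to be a global name that the worker processes inherit, which is the case when workers are started with fork. The arguments passed to the pool are only integers, which pickle trivially, so no complex object has to be pickled and transferred to each process. That's at least how I understand the issue.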
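A variation on the same idea, in case the workers do not inherit the parent's globals (for example with the spawn start method): let every worker build its own global through the pool's initializer argument. This is a different technique from the answer above, not a restatement of it. load_scans is a hypothetical stand-in for whatever produces scans, and the sketch assumes it is cheap enough to call once per worker.

```python
import os
from multiprocessing import Pool

_scans = None  # per-process global, filled in by init_worker()


def load_scans():
    """Hypothetical loader; the real one would open the scan source."""
    return ["scan_a", "scan_b", "scan_c"]


def init_worker():
    """Runs once in every worker, so the scans themselves are never pickled."""
    global _scans
    _scans = load_scans()


def process(i_x):
    # Only the integer index crosses the process boundary.
    return _scans[i_x].upper()  # placeholder work


def run():
    """Start the pool and map process() over the indices."""
    n = len(load_scans())
    with Pool(os.cpu_count(), initializer=init_worker) as pool:
        return pool.map(process, range(n))

# Usage (call from under an `if __name__ == "__main__":` guard):
#     results = run()
```

The trade-off is that each worker loads the data itself, which costs time and memory per process but avoids sending unpicklable objects through the pool.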


 