Pickling error when using python multiprocessing with class functions

I have three scripts: scheduler.py, a parallel task runner based on multiprocessing.Process and multiprocessing.Pipe; simulation.pyx, which contains the classes and functions that I want to run in parallel via scheduler.py; and a small main script that creates an instance of the parallelization class from scheduler.py, passes it to the classes in simulation.pyx, and runs the whole thing.

When the target function is defined at the top level of simulation.pyx, everything works fine, but as soon as I try to use scheduler.py with a method of a class in simulation.pyx, I get a pickling error.

Since the code is several thousand lines long, I'll only give some conceptual code:

small_main_script.py :

import simulation
import scheduler


if __name__ == '__main__':

    main = simulation.Main()
    scheduler = scheduler.parallel()
    main.simulate(scheduler)



simulation.pyx :

import scheduler

cdef do_something_with_job(job):
    ...

cdef class Main:
    cdef public ...
    ...

    def __init__(self):
        ...

    def some_function(self,job):
        ...
        do_something_with_job(job)
        ...

    def simulate(self, scheduler):

        for job in job_list:
            scheduler.add_jobs(job)

        scheduler.target_function = self.some_function

        scheduler.run_in_parallel()

The thing is that if I use a trivial dummy function like

import time

def sleep(job):
    time.sleep(2)

and put it at the top level, i.e. outside the classes, the parallelization works fine, but as soon as I put it inside the class Main I get a pickling error. I get the same error with my real target function, which is also defined in the class Main and which I don't want to move to the top level. The following is what happens when I use the dummy function sleep(self, job) inside the class Main. When it's outside the class it works fine.

PicklingError: Can't pickle <built-in method sleep of simulation.Main object at 0x0D4A3C00>: it's not found as __main__.sleep

In [2]: Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Python27\lib\multiprocessing\forking.py", line 381, in main
    self = load(from_parent)
  File "C:\Python27\lib\pickle.py", line 1384, in load
    return Unpickler(file).load()
  File "C:\Python27\lib\pickle.py", line 864, in load
    dispatch[key](self)
  File "C:\Python27\lib\pickle.py", line 886, in load_eof
    raise EOFError
EOFError

I'm using Python 2.7.

Update

I have managed to isolate the problem further. When I use the multiprocessing module from the third-party pathos package, I'm able to pickle class methods. The problem now seems to be that I get an error when the function arguments are class instances.

From the Python multiprocessing programming guidelines:

Picklability: Ensure that the arguments to the methods of proxies are picklable.

With the standard pickle module in Python 2, the only functions that can be pickled are those defined at the top level of a module.

Functions that are not at the top level (class or instance methods, nested functions, etc.) are hard to pickle because it is hard to look them up in a portable manner in the child process. The process that receives an instance method might know nothing about the object that owns it.
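A minimal sketch of why the lookup matters, using only the standard pickle module. pickle stores a function as a reference (module name plus function name) rather than as code, so the receiving process must be able to find that name again. The names top_level, make_nested and can_pickle are hypothetical, and the snippet is written for Python 3, where the exception raised for a nested function differs between versions, so several are caught.

```python
import pickle


def top_level(job):
    """Defined at module level, so pickle can refer to it by name."""
    return job * 2


def make_nested():
    """Return a function that exists only inside this call."""
    def nested(job):
        return job * 2
    return nested


def can_pickle(obj):
    """Return True if the standard pickle module can serialise obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


print("top-level function picklable:", can_pickle(top_level))
print("nested function picklable:", can_pickle(make_nested()))
```

The nested function cannot be reached through a module-level name, which is the same reason the "it's not found as __main__.sleep" message appears in the question.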

As the programming guidelines suggest:

However, one should generally avoid sending shared objects to other processes using pipes or queues. Instead you should arrange the program so that a process which needs access to a shared resource created elsewhere can inherit it from an ancestor process.

In other words, hand the method to the process when you create it, through the target keyword, rather than sending it over a pipe or queue.
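One way to follow that advice while keeping the logic in a class is a thin module-level wrapper used as the process target. The sketch below is an assumption about how this could look, not the scheduler.py from the question: Simulation, run_job and run_in_parallel are hypothetical names, and the object is built inside the child process so that only plain job values cross the process boundary.

```python
import multiprocessing


class Simulation(object):
    """Hypothetical stand-in for the Main class in the question."""

    def __init__(self, scale):
        self.scale = scale

    def some_function(self, job):
        return job * self.scale


def run_job(job, scale, results):
    """Module-level target, so it can be pickled by name."""
    sim = Simulation(scale)  # built inside the child process
    results.put((job, sim.some_function(job)))


def run_in_parallel(jobs, scale):
    """Start one process per job and collect the results in a dict."""
    results = multiprocessing.Queue()
    workers = [
        multiprocessing.Process(target=run_job, args=(job, scale, results))
        for job in jobs
    ]
    for worker in workers:
        worker.start()
    # Read one result per worker before joining, so no child is left
    # blocked while writing to the queue.
    collected = dict(results.get() for _ in workers)
    for worker in workers:
        worker.join()
    return collected
```

On Windows, run_in_parallel should be called from under an if __name__ == '__main__': guard, as in small_main_script.py above; for example, run_in_parallel([1, 2, 3], scale=10) returns a dict that maps each job to its result.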

The pathos library relies on dill, which extends pickle so that more types can be serialised than the standard module supports.
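If adding a dependency is not an option, the standard library has a hook for the class-instance arguments mentioned in the update: a class can define __reduce__ to tell pickle how to rebuild an instance. This is a pure-Python sketch with a hypothetical Job class. Plain Python classes like this one are usually picklable already, so the hook matters mainly for types where the default fails; whether the cdef classes in simulation.pyx need it is an assumption, since that code is not shown.

```python
import pickle


class Job(object):
    """Hypothetical job object passed as an argument to a worker."""

    def __init__(self, name, steps):
        self.name = name
        self.steps = steps

    def __reduce__(self):
        # Tell pickle to rebuild the object by calling Job(name, steps).
        return (Job, (self.name, self.steps))


original = Job("warmup", 3)
restored = pickle.loads(pickle.dumps(original))
print(restored.name, restored.steps)
```

The tuple returned by __reduce__ holds a callable and its arguments, so the class itself still has to be importable by name in the child process.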

In general, mixing OOP and multiprocessing is not recommended, because it has several corner cases that can be misleading. This is one of them.
