
How can I share variables between processes in Python 3?

I am using Python 3.9.6 x64 on Windows 10, and I want to know how I can share variables between sub-processes.

Why? Because I am writing an asynchronous multi-connection resumable downloader using requests. By default, requests enables keep-alive, which leaves hanging connections from dead requests and often causes the server to send extra bytes from those previous dead connections upon resuming. I have tried r.close() , r.connection.close() , s.close() and s.headers['connection'] = 'close' ; all of them failed to solve the problem while using threading.Thread .

However, I have never encountered a scenario where extra bytes from a previous process were received, and all downloads without pausing were successful (that is, if the connection didn't die mid-download, which happens rather often). So I think the connections are guaranteed to be killed if the corresponding processes are killed, and I am looking for a solution using multiprocessing.Process (I know downloading is I/O-bound, not CPU-bound, etc.; however, threads don't receive new PIDs). But I don't know how to share variables between processes...

Specifically I want to share two objects:

1. An mmap object that stores the downloaded data.

2. A dictionary created like this:

self.progress['total'] = total
self.progress['connections'] = num_connections
for i in range(num_connections):
    ...
    self.progress[i] = dict()
    self.progress[i]['start'] = start
    self.progress[i]['position'] = start
    self.progress[i]['end'] = end
    self.progress[i]['count'] = 0
    self.progress[i]['length'] = length
    self.progress[i]['completed'] = False

All variables are integers and the above code only demonstrates how the dict is created.

I Googled "python share variable between processes" and this time Google did find relevant results from this site; however, all the answers are for Python 2 and won't work on my machine.

For example this answer: https://stackoverflow.com/a/17393879/16383578

After reformatting it for Python 3, I tried to run it, and...

In [1]: import time
   ...: from multiprocessing import Process, Manager, Value
   ...:
   ...: def foo(data, name=''):
   ...:     print(type(data), data.value, name)
   ...:     data.value += 1
   ...:
   ...: if __name__ == "__main__":
   ...:     manager = Manager()
   ...:     x = manager.Value('i', 0)
   ...:     y = Value('i', 0)
   ...:
   ...:     for i in range(5):
   ...:         Process(target=foo, args=(x, 'x')).start()
   ...:         Process(target=foo, args=(y, 'y')).start()
   ...:
   ...:     print('Before waiting: ')
   ...:     print('x = {0}'.format(x.value))
   ...:     print('y = {0}'.format(y.value))
   ...:
   ...:     time.sleep(5.0)
   ...:     print('After waiting: ')
   ...:     print('x = {0}'.format(x.value))
   ...:     print('y = {0}'.format(y.value))
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "c:\program files\python39\lib\multiprocessing\spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "c:\program files\python39\lib\multiprocessing\spawn.py", line 126, in _main
    self = reduction.pickle.load(from_parent)
EOFError: Ran out of input
---------------------------------------------------------------------------
PicklingError                             Traceback (most recent call last)
<ipython-input-1-779bd728820e> in <module>
     12
     13     for i in range(5):
---> 14         Process(target=foo, args=(x, 'x')).start()
     15         Process(target=foo, args=(y, 'y')).start()
     16

c:\program files\python39\lib\multiprocessing\process.py in start(self)
    119                'daemonic processes are not allowed to have children'
    120         _cleanup()
--> 121         self._popen = self._Popen(self)
    122         self._sentinel = self._popen.sentinel
    123         # Avoid a refcycle if the target function holds an indirect

c:\program files\python39\lib\multiprocessing\context.py in _Popen(process_obj)
    222     @staticmethod
    223     def _Popen(process_obj):
--> 224         return _default_context.get_context().Process._Popen(process_obj)
    225
    226 class DefaultContext(BaseContext):

c:\program files\python39\lib\multiprocessing\context.py in _Popen(process_obj)
    325         def _Popen(process_obj):
    326             from .popen_spawn_win32 import Popen
--> 327             return Popen(process_obj)
    328
    329     class SpawnContext(BaseContext):

c:\program files\python39\lib\multiprocessing\popen_spawn_win32.py in __init__(self, process_obj)
     91             try:
     92                 reduction.dump(prep_data, to_child)
---> 93                 reduction.dump(process_obj, to_child)
     94             finally:
     95                 set_spawning_popen(None)

c:\program files\python39\lib\multiprocessing\reduction.py in dump(obj, file, protocol)
     58 def dump(obj, file, protocol=None):
     59     '''Replacement for pickle.dump() using ForkingPickler.'''
---> 60     ForkingPickler(file, protocol).dump(obj)
     61
     62 #

PicklingError: Can't pickle <function foo at 0x000001E1AFB4AC10>: attribute lookup foo on __main__ failed

How can I actually start new processes and share variables between them?


Hmm, I don't understand. The above code was run in the IPython shell and the error was generated; however, when run as a script, it works fine without errors. As far as I understand, the __name__ variable is '__main__' in both cases, so why do they behave differently?

Does pickle work in the interactive Python interpreter? It seems that it doesn't, according to the docs: https://docs.python.org/3/library/pickle.html

What can be pickled and unpickled?

The following types can be pickled:

- None, True, and False
- integers, floating point numbers, complex numbers
- strings, bytes, bytearrays
- tuples, lists, sets, and dictionaries containing only picklable objects
- functions defined at the top level of a module (using def, not lambda)
- built-in functions defined at the top level of a module
- classes that are defined at the top level of a module
- instances of such classes whose __dict__ or the result of calling __getstate__() is picklable (see section Pickling Class Instances for details).

However, this error from the Python 3.9.6 interpreter clearly states that the "shell session" (I don't know what it's called) is a module:

AttributeError: Can't get attribute 'foo' on <module '__main__' (built-in)>

Does pickle work in the interactive interpreter? If it doesn't, then why?

I tried to Google this and, not surprisingly, failed to find anything useful yet again.


Nope, pickle.dumps() and pickle.loads() work fine, and reduction.pickle points to the standard pickle module:

>>> from multiprocessing.spawn import reduction
>>> reduction.pickle
<module 'pickle' from 'C:\\Program Files\\Python39\\lib\\pickle.py'>

So the error really is AttributeError: Can't get attribute 'foo' on <module '__main__' (built-in)> . It seems that when running as a script, foo is an attribute of __main__ , but when running in the shell it isn't. Why?

Also, if I type __name__ , Python returns '__main__' , which is a str ; however, when I type foo in IPython, it shows:

<function __main__.foo(data, name='')>

This seems to suggest that __main__ is a variable; however, when I type __main__ :

In [8]: __main__
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-8-19bc6e7e07cd> in <module>
----> 1 __main__

NameError: name '__main__' is not defined

So what exactly is __main__ ? I guess it is the current global namespace; if so, how can I access it in the shell (what is its variable name)?


It seems multiprocessing doesn't work in the Python shell, but I can't confirm this because Google failed to find anything relevant...


I just changed the two occurrences of .start() to .run() and the above code magically worked in the Python shell. Why is Process.start() different from Process.run() ?


So multiprocessing really doesn't work in the Python shell?


This thing is really complicated. I installed Windows 10 21H1 on VMware Workstation 16 Player, then installed Python 3.9.6 x64 on it just to test the code.

I typed the code in line by line, and when the processes started, the same errors occurred (I don't yet know how to copy and paste between the host machine and the guest machine):

(screenshot: the same error occurring in the VM)

So either I am extremely unlucky, or Python 3.9.6 for Windows x64 is bugged...

I've replicated your environment as best I can (Windows, Python 3.9.6) but I am unable to reproduce your error. The multiprocessing code works fine and is able to share the objects between processes using both multiprocessing.Manager and multiprocessing.Value .

This issue usually happens because only functions defined at the top level of a module can be pickled.

I suggest you try pathos.multiprocessing , a fork of multiprocessing that uses dill instead of pickle; dill can serialize almost anything in Python.

import time
from pathos.helpers import mp

def foo(data, name=''):
    print(type(data), data.value, name)
    data.value += 1

if __name__ == "__main__":

    manager = mp.Manager()
    x = manager.Value('i', 0)
    y = mp.Value('i', 0)

    for i in range(3):
        mp.Process(target=foo, args=(x, 'x')).start()
        mp.Process(target=foo, args=(y, 'y')).start()

    print('Before waiting: ')
    print('x = {0}'.format(x.value))
    print('y = {0}'.format(y.value))

    time.sleep(3)
    print('After waiting: ')
    print('x = {0}'.format(x.value))
    print('y = {0}'.format(y.value))
