import multiprocessing

# list with global scope
result = [100, 200]

def square_list(mylist):
    """
    function to square a given list
    """
    global result
    # append squares of mylist to global list result
    for num in mylist:
        result.append(num * num)
    # print global list result
    print("Result(in process p1): {}".format(result))

if __name__ == "__main__":
    # input list
    mylist = [1, 2, 3, 4]
    # creating new process
    p1 = multiprocessing.Process(target=square_list, args=(mylist,))
    # starting process
    p1.start()
    # wait until process is finished
    p1.join()
    # print global result list
    print("Result(in main program): {}".format(result))
Here, the global variable result can be accessed by the function that is running in a new process. Since the new process has its own Python interpreter and its own memory space, how can it access the global variable from the parent process?
Note: I understand the concepts of queue/pipe/manager/array/value. This question specifically asks how the child process has READ access to a global variable from the parent process.
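For reference, here is a small sketch that shows which start method the interpreter uses and that the child can read the global either way (the function name show is just illustrative):

```python
import multiprocessing
import os

# global defined at module scope
result = [100, 200]

def show():
    # under "fork" the child inherits a copy-on-write copy of result;
    # under "spawn" this module is re-imported and result is re-created,
    # so the child can read it in both cases
    print("child pid:", os.getpid(), "result:", result)

if __name__ == "__main__":
    print("start method:", multiprocessing.get_start_method())
    p = multiprocessing.Process(target=show)
    p.start()
    p.join()
```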
As I mentioned in my comment to your question, you should use a managed list that is passed as an additional argument to square_list:
import multiprocessing

def square_list(result, mylist):
    """
    function to square a given list
    """
    # append squares of mylist to global list result
    for num in mylist:
        result.append(num * num)
    # print global list result
    print("Result(in process p1): {}".format(result))

if __name__ == "__main__":
    # input list
    mylist = [1, 2, 3, 4]
    result = multiprocessing.Manager().list([100, 200])
    # creating new process
    p1 = multiprocessing.Process(target=square_list, args=(result, mylist))
    # starting process
    p1.start()
    # wait until process is finished
    p1.join()
    # print global result list
    print("Result(in main program): {}".format(result))
Prints:
Result(in process p1): [100, 200, 1, 4, 9, 16]
Result(in main program): [100, 200, 1, 4, 9, 16]
Notes
If your subprocess ("child process") were only reading the result list, then your code would be fine as is. But things get a bit complicated when you want to update the list and have the update reflected back in the main process.
There are two ways a subprocess can update an object that has been created by the main process (I will ultimately get to the issue of the object actually being a global variable):
Let's take the case of a simple shared-memory object being updated via a global variable. For this I will use a simple multiprocessing.Value instance to create a shared integer:
import multiprocessing

v = multiprocessing.Value('i', 1)  # initialize to 1

def worker():
    v.value += 10

if __name__ == "__main__":
    p = multiprocessing.Process(target=worker)
    p.start()
    p.join()
    print(v.value)
On Windows, this prints 1 and not 11 as you might expect. This is because on Windows new processes are created using the spawn method: a new, empty address space is created, a new Python interpreter is launched, and the source is re-executed from the top. Any code at global scope is executed except code within the if __name__ == "__main__": block, since in the newly minted process __name__ will not be "__main__" (and that is a good thing, since otherwise you would get into a recursive loop re-creating new subprocesses).
But this means that the subprocess has just created its own instance of the global variable v. So for this to work, v cannot be global and must instead be passed as an argument to worker.
But there is a way if you instead use a multiprocessing pool. This facility allows you to initialize each process in the pool with a special pool-initializer function:
import multiprocessing

# initialize each process (there is only 1) in the pool
def init_pool(shared_v):
    global v
    v = shared_v  # v is global

def worker():
    v.value += 10

if __name__ == "__main__":
    v = multiprocessing.Value('i', 1)  # I am global
    # create pool of size 1:
    pool = multiprocessing.Pool(1, initializer=init_pool, initargs=(v,))
    pool.apply(worker)
    print(v.value)
Prints:
11
Unfortunately, it is a bit of work to implement a list using the shared-memory data types available. That is why I recommended using a managed list:
import multiprocessing

# initialize each process (there is only 1) in the pool
def init_pool(shared_result):
    global result
    result = shared_result  # result is global

def square_list(mylist):
    """
    function to square a given list
    """
    # append squares of mylist to global list result
    for num in mylist:
        result.append(num * num)
    # print global list result
    print("Result(in process p1): {}".format(result))

if __name__ == "__main__":
    # input list
    mylist = [1, 2, 3, 4]
    result = multiprocessing.Manager().list([100, 200])
    pool = multiprocessing.Pool(1, initializer=init_pool, initargs=(result,))
    pool.apply(square_list, args=(mylist,))
    # print global result list
    print("Result(in main program): {}".format(result))
Prints:
Result(in process p1): [100, 200, 1, 4, 9, 16]
Result(in main program): [100, 200, 1, 4, 9, 16]
This technique works on Windows, Linux, etc., i.e. on all platforms.
Move your updatable global variables to within the if __name__ == '__main__': block (they are still global to the main process) and use a pool-initializer function to initialize the pool processes with these variables. In fact, for platforms that use spawn, you should consider moving all global definitions that are not required by subprocesses and are expensive to create to within the if __name__ == '__main__': block.
One of the key tenets of processes vs. threads is that they don't share memory. A few mechanisms exist to actually share memory, but in general with processes you should pass messages via queues, pipes, etc.
Here is an example of passing a return value back to the parent via a queue:
import multiprocessing

# list with global scope
result = [100, 200]  # result is re-created on import in the child process

def square_list(mylist, ret_q):
    """
    function to square a given list
    """
    global result
    # append squares of mylist to global list result
    for num in mylist:
        result.append(num * num)
    # print global list result
    print("Result(in process p1): {}".format(result))
    ret_q.put(result)  # send the modified result to the main process

if __name__ == "__main__":
    # input list
    mylist = [1, 2, 3, 4]
    # return queue
    ret_q = multiprocessing.Queue()
    # creating new process
    p1 = multiprocessing.Process(target=square_list, args=(mylist, ret_q))
    # starting process
    p1.start()
    # wait for the result
    result = ret_q.get()
    # wait until process is finished
    p1.join()
    # print global result list
    print("Result(in main program): {}".format(result))