from multiprocessing import Pool, cpu_count
import numpy as np
from numpy.random import multivariate_normal

F = multivariate_normal(np.zeros(3), np.eye(3), (3, 5))

def test(k):
    print(k)
    res = np.zeros((5, 3))
    for i in range(3):
        res[:, i] = F[k, :, i]
        #print(res[:, i])
    return res

if __name__ == '__main__':
    with Pool(cpu_count()) as pool:
        result = pool.map(test, range(3))
        pool.close()
        pool.join()
    result = np.array(results)

In Python 3.6, result is equal to the random matrix F, but in Python 3.8 the two matrices are different. This is just an example. In the real code, I want to pick out each column of F at each time step and do some operations on it.

Psychic debugging: You're running on Windows (any Python version) or macOS (Python 3.8 or later), both of which default to the 'spawn' method of making worker processes, rather than 'fork'. When that happens, the __main__ module is imported (with a different name, so it doesn't try to run the main code again) in the child process to simulate a fork.
This mostly works, but it fails badly in the case of self-seeding PRNGs, because they reseed in the child process, and the globals are regenerated from that new PRNG, rather than having their generated values inherited by the child.
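The effect can be reproduced without multiprocessing at all. The following is a minimal sketch, not the real spawn machinery: make_F() is a hypothetical helper standing in for the module-level line that builds F, and the two np.random.seed() calls (no argument, so the PRNG reseeds itself from OS entropy) stand in for the parent and the spawned child each starting a fresh interpreter.

```python
import numpy as np
from numpy.random import multivariate_normal


def make_F():
    # Hypothetical stand-in for the module-level statement in the question.
    return multivariate_normal(np.zeros(3), np.eye(3), (3, 5))


# Parent import: the global PRNG seeds itself, then F is drawn from it.
np.random.seed()
F_parent = make_F()

# Child re-import under 'spawn': a new interpreter seeds itself again,
# so the very same statement draws different numbers.
np.random.seed()
F_child = make_F()

print(np.array_equal(F_parent, F_child))
```

The two arrays have the same shape but different contents, which is the mismatch between result and F reported for Python 3.8.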
In short, the ways to make this work are:
1. Use 'fork' as the multiprocessing start method (not possible on Windows; technically allowed on macOS, but it could break, which is why they changed the default to 'spawn'). I tested your code on Linux (where it forks by default), and aside from fixing a typo (you typed results in one place where it should be result), result and F are the same there. When I add multiprocessing.set_start_method('spawn') just before creating the Pool, they don't match.
2. Generate F in the parent and pass it as an initializer argument to the Pool, so each worker resets F to the value seen in the parent.
3. Use a fixed seed when generating F, so it's consistent no matter the process (downside: it'll be the same every run, or at least very predictable, depending how clever you try to get).

Note that #2 and #3 can be combined to minimize dataflow. Generate a "real" random seed in the parent (e.g. with os.urandom) and write a simple function that accepts a seed and uses it to both seed the PRNG and generate F (using global F to let it change the global value). Call that function in the parent and pass it as the initializer with the seed argument to each child. Now, instead of passing the generated value of F (potentially huge), you only need to pass the seed, and the child process can reproduce F locally without needing to serialize the whole thing. Downside: all processes share the same random seed. It's not predictable like a hardcoded seed, but parents and children will be drawing from an identical set of random numbers.