
random.random() generates the same number in multiprocessing

I'm working on an optimization problem. A simplified version of my code is posted below (the original code is too complicated to post in a question, and I hope the simplified version reproduces its behavior as closely as possible).

My purpose: use the function foo inside the function optimization, but foo can take a very long time in some hard cases. So I use multiprocessing to set a time limit on its execution (proc.join(iter_time); the method comes from an answer to this question: How to limit execution time of a function call?).

My problem:

  1. In the while loop, the value generated for extra is the same every time.
  2. The length of the list lst is always 1, which means that every iteration of the while loop starts from an empty list.

My guess: a possible reason is that each time I create a process, the random seed starts counting from the beginning, and each time the process terminates, some garbage collection mechanism may clean up the memory the process used, so the list is cleared.

My question:

  1. Does anyone know the real reason for these problems?
  2. If not with multiprocessing, is there any other way to achieve my purpose while still generating different random numbers? By the way, I have tried func_timeout, but it has other problems that I cannot handle...

import multiprocessing
import random
import time

random.seed(123)
lst = []  # a global list for logging data

def foo(epoch):
    ...
    extra = random.random()
    lst.append(epoch + extra)
    ...

def optimization(loop_time, iter_time):
    start = time.time()
    epoch = 0
    while time.time() <= start + loop_time:
        proc = multiprocessing.Process(target=foo, args=(epoch,))
        proc.start()
        proc.join(iter_time)
        if proc.is_alive():  # if the process is not terminated within time limit
            print("Time out!")
            proc.terminate()

if __name__ == '__main__':
    optimization(300, 2)

You need to use shared memory if you want to share variables across processes, because child processes do not share their memory space with the parent: each child appends to its own copy of lst, and the parent's list never changes. The simplest fix here is to use a managed list and to delete the line that sets the seed. That line is what causes the same number to be generated, because every child process starts from the same seed (for example, under the spawn start method each child re-imports the module and runs random.seed(123) again). To get different random numbers, either don't set a seed or pass a different seed to each process:

import time, random
from multiprocessing import Manager, Process

def foo(epoch, lst):
    extra = random.random()
    lst.append(epoch + extra)

def optimization(loop_time, iter_time, lst):
    start = time.time()
    epoch = 0
    while time.time() <= start + loop_time:
        proc = Process(target=foo, args=(epoch, lst))
        proc.start()
        proc.join(iter_time)
        if proc.is_alive():  # if the process is not terminated within time limit
            print("Time out!")
            proc.terminate()
    print(lst)

if __name__ == '__main__':
    manager = Manager()
    lst = manager.list()
    optimization(10, 2, lst)

Output

[0.2035898948744943, 0.07617925389396074, 0.6416754412198231, 0.6712193790613651, 0.419777147554235, 0.732982735576982, 0.7137712131028766, 0.22875414425414997, 0.3181113880578589, 0.5613367673646847, 0.8699685474084119, 0.9005359611195111, 0.23695341111251134, 0.05994288664062197, 0.2306562314450149, 0.15575356275408125, 0.07435292814989103, 0.8542361251850187, 0.13139055891993145, 0.5015152768477814, 0.19864873743952582, 0.2313646288041601, 0.28992667535697736, 0.6265055915510219, 0.7265797043535446, 0.9202923318284002, 0.6321511834038631, 0.6728367262605407, 0.6586979597202935, 0.1309226720786667, 0.563889613032526, 0.389358766191921, 0.37260564565714316, 0.24684684162272597, 0.5982042933298861, 0.896663326233504, 0.7884030244369596, 0.6202229004466849, 0.4417549843477827, 0.37304274232635715, 0.5442716244427301, 0.9915536257041505, 0.46278512685707873, 0.4868394190894778, 0.2133187095154937]
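
The other option, passing a different seed to each process, could look roughly like the sketch below. It is only an illustration under assumptions: BASE_SEED, draw_extra, run, and the extra seed argument of foo are hypothetical names that do not appear in the original code, and deriving the seed as BASE_SEED + epoch is just one simple choice.

```python
import random
from multiprocessing import Manager, Process

BASE_SEED = 123  # hypothetical base seed, chosen only for illustration


def draw_extra(seed):
    # Use a private generator per call, so the module-level random state
    # (and any seed set at import time) is not involved.
    return random.Random(seed).random()


def foo(epoch, seed, lst):
    lst.append(epoch + draw_extra(seed))


def run(n_epochs, iter_time, lst):
    for epoch in range(n_epochs):
        # Assumption: each epoch gets its own seed, derived from the base seed.
        proc = Process(target=foo, args=(epoch, BASE_SEED + epoch, lst))
        proc.start()
        proc.join(iter_time)
        if proc.is_alive():  # still running after the time limit
            print("Time out!")
            proc.terminate()
            proc.join()


if __name__ == '__main__':
    manager = Manager()
    lst = manager.list()
    run(3, 2, lst)
    print(list(lst))
```

With this arrangement the values differ from one epoch to the next, yet a rerun with the same BASE_SEED should reproduce the same sequence, which can be useful when debugging an optimization run.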

Keep in mind that using a manager will affect the performance of your code. Alternatively, you could use multiprocessing.Array, which is faster than a manager but less flexible in the data it can store, or a Queue.
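
As a rough sketch of the Queue alternative, the child can send its result back through a multiprocessing.Queue while the parent keeps the list. The helper name run_with_timeout and the choice of one queue per call are assumptions made for this illustration, not part of the original code.

```python
import random
from multiprocessing import Process, Queue
from queue import Empty


def foo(epoch, results):
    # The child sends its value back instead of appending to a global list.
    results.put(epoch + random.random())


def run_with_timeout(epoch, iter_time):
    """Return foo's value, or None if nothing arrived within iter_time seconds."""
    results = Queue()  # assumption: a fresh queue per call keeps cleanup simple
    proc = Process(target=foo, args=(epoch, results))
    proc.start()
    try:
        # Read before joining, so the child is never blocked on an unread item.
        value = results.get(timeout=iter_time)
    except Empty:
        value = None
        proc.terminate()
    proc.join()
    return value


if __name__ == '__main__':
    lst = []  # lives only in the parent process
    for epoch in range(3):
        value = run_with_timeout(epoch, 2)
        if value is None:
            print("Time out!")
        else:
            lst.append(value)
    print(lst)
```

Because only the parent touches lst here, no shared list is needed. The trade-off is that every result has to be picklable to travel through the queue.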
