
Python: parallel execution of a function which has a sequential loop inside

I am reproducing some simple 10-armed bandit experiments from Sutton and Barto's book Reinforcement Learning: An Introduction. Some of these require significant computation time, so I tried to take advantage of my multicore CPU.

Here is the function which I need to run 2000 times. It has 1000 sequential steps which incrementally improve the reward:

import numpy as np

def foo(eps): # need an (unused) argument to use pool.map()
    # initialising
    # the true values of the actions
    q = np.random.normal(0, 1, size=10)
    # the estimated values
    q_est = np.zeros(10)
    # the counter of how many times each of the 10 actions was chosen
    n = np.zeros(10)

    rewards = []
    for i in range(1000):
        # choose an action based on its estimated value
        a = np.argmax(q_est)
        # get the normally distributed reward 
        rewards.append(np.random.normal(q[a], 1)) 
        # increment the chosen action counter
        n[a] += 1 
        # update the estimated value of the action
        q_est[a] += (rewards[-1] - q_est[a]) / n[a] 
    return rewards

I execute this function 2000 times to get a (2000, 1000) array:

reward = np.array([foo(0) for _ in range(2000)])

Then I plot the mean reward across the 2000 experiments:

import matplotlib.pyplot as plt
plt.plot(np.arange(1000), reward.mean(axis=0))

[Figure: sequential plot]

which fully matches the expected result (it looks the same as in the book). But when I try to execute it in parallel, I get a much greater standard deviation of the average reward:

import multiprocessing as mp
with mp.Pool(mp.cpu_count()) as pool:
    reward_p = np.array(pool.map(foo, [0]*2000))
plt.plot(np.arange(1000), reward_p.mean(axis=0))

[Figure: parallel plot]

I suppose this is due to the parallelization of the loop inside foo. As I reduce the number of cores allocated to the task, the reward plot approaches the expected shape.

Is there a way to take advantage of multiprocessing here while still getting the correct results?

UPD: I ran the same code on Windows 10, and there the sequential and parallel results turned out to be the same! What could be the reason?

Ubuntu 20.04, Python 3.8.5, Jupyter

Windows 10, Python 3.7.3, Jupyter

As we found out, the behaviour differs between Windows and Ubuntu. That is probably because the two use different default start methods, which the multiprocessing documentation describes like this:

spawn: The parent process starts a fresh Python interpreter process. The child process will only inherit those resources necessary to run the process object's run() method. In particular, unnecessary file descriptors and handles from the parent process will not be inherited. Starting a process using this method is rather slow compared to using fork or forkserver.

Available on Unix and Windows. The default on Windows and macOS.

fork: The parent process uses os.fork() to fork the Python interpreter. The child process, when it begins, is effectively identical to the parent process. All resources of the parent are inherited by the child process. Note that safely forking a multithreaded process is problematic.

Available on Unix only. The default on Unix.
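Why this matters here, as far as I can tell: with fork, each worker starts with a copy of the parent's global NumPy random state, so several workers can draw exactly the same "random" numbers, and the 2000 runs are no longer independent. Here is a small sketch to check that on your machine. The names draw, count_unique_rows and unique_rows_for are made up for this example, and it is meant to be run as a script (spawn needs the functions to be importable). I'm not showing output because it depends on how tasks get scheduled, but I would expect fewer distinct rows under fork than under spawn.

```python
import multiprocessing as mp
import numpy as np

def draw(_):
    # Uses NumPy's global random state, like foo in the question
    return np.random.normal(0, 1, size=3)

def count_unique_rows(rows):
    """Number of distinct rows in a 2-D array (hypothetical helper)."""
    return len(np.unique(np.asarray(rows), axis=0))

def unique_rows_for(method, n_tasks=64):
    """Call draw() n_tasks times in a 4-worker pool using the given start method."""
    ctx = mp.get_context(method)
    with ctx.Pool(4) as pool:
        return count_unique_rows(pool.map(draw, range(n_tasks), chunksize=1))

if __name__ == "__main__":
    for method in ("fork", "spawn"):
        # fork is not available on Windows, so skip what the platform lacks
        if method in mp.get_all_start_methods():
            print(method, unique_rows_for(method), "distinct rows out of 64")
```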

Try adding this line to your code, before the pool is created:

mp.set_start_method('spawn')
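If the cause is indeed that forked workers share the same NumPy random state, another option that does not depend on the start method is to give every run its own random generator. Below is a minimal sketch, assuming NumPy 1.17 or later (for np.random.SeedSequence and np.random.default_rng). run_bandit and make_seeds are names I made up, and the entropy value 12345 is arbitrary; the bandit logic itself is the same as in foo.

```python
import multiprocessing as mp
import numpy as np

def run_bandit(seed_seq, n_arms=10, n_steps=1000):
    """One greedy bandit run driven by its own generator (hypothetical helper)."""
    rng = np.random.default_rng(seed_seq)
    q = rng.normal(0, 1, size=n_arms)   # true action values
    q_est = np.zeros(n_arms)            # estimated action values
    n = np.zeros(n_arms)                # how many times each action was chosen
    rewards = np.empty(n_steps)
    for i in range(n_steps):
        a = np.argmax(q_est)                         # greedy action
        rewards[i] = rng.normal(q[a], 1)             # noisy reward
        n[a] += 1
        q_est[a] += (rewards[i] - q_est[a]) / n[a]   # incremental mean update
    return rewards

def make_seeds(n_runs, entropy=12345):
    """One independent child SeedSequence per run (hypothetical helper)."""
    return np.random.SeedSequence(entropy).spawn(n_runs)

if __name__ == "__main__":
    with mp.Pool(mp.cpu_count()) as pool:
        reward_p = np.array(pool.map(run_bandit, make_seeds(2000)))
    print(reward_p.shape)
```

A side benefit of passing the seeds in explicitly is that the whole experiment becomes reproducible, whether it is run sequentially or in a pool.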
