Handling parallel tasks with a worker pool

Question

I'm new to Python and am trying to understand multiprocessing.

I'm trying to run the following simple script:

import multiprocessing
from multiprocessing import Pool

def f(x):
    print(f'PID {multiprocessing.current_process().pid}')
    return x*x

if __name__ == '__main__':
    with Pool(processes=4) as pool:         # start 4 worker processes
        print(pool.map(f, range(10)))       # prints "[0, 1, 4,..., 81]"

I expected to see prints with 4 different PIDs, because I started the pool with processes=4.

However, every print shows the same PID.

Output:

PID 7412
PID 7412
PID 7412
PID 7412
PID 7412
PID 7412
PID 7412
PID 7412
PID 7412
PID 7412
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

What am I missing in my understanding of Pool?

Answer 1

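This appears to happen because the tasks take less time to execute than the other worker processes take to start accepting work. That seems like desirable behaviour: the point of a multiprocessing Pool is to dispatch work as quickly as possible, so if one process is ready first it seems reasonable for everything to be dispatched to it.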
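The reasoning above can be illustrated with a toy simulation that starts no processes at all. This is only a sketch under stated assumptions: the function name simulate, the start-up times, and the task durations are all hypothetical, and the model ignores chunking and inter-process communication, which a real Pool also has to deal with.

```python
import heapq
from collections import Counter

def simulate(ready_times, n_tasks, task_duration):
    """Toy model: each task goes to whichever worker is free earliest."""
    # Heap of (time at which the worker is next free, worker index).
    free_at = [(t, i) for i, t in enumerate(ready_times)]
    heapq.heapify(free_at)
    counts = Counter()
    for _ in range(n_tasks):
        t, worker = heapq.heappop(free_at)
        counts[worker] += 1
        # The worker is busy until its current task finishes.
        heapq.heappush(free_at, (t + task_duration, worker))
    return counts

# Hypothetical start-up times in seconds: workers come up 30 ms apart.
ready = [0.00, 0.03, 0.06, 0.09]

# Near-instant tasks: the first worker finishes all of them
# before the second worker is even ready.
print(simulate(ready, n_tasks=10, task_duration=0.0001))

# 50 ms tasks: the other workers get a chance to pick up work.
print(simulate(ready, n_tasks=10, task_duration=0.05))
```

In this model the first call assigns all 10 tasks to worker 0, while the second spreads them over all four workers, which matches the behaviour described in the question.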

I see this happen more reliably when processes are spawned rather than forked (Linux defaults to fork, while Windows can only spawn workers).
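If you are unsure which start method your own setup uses, a small diagnostic along these lines may help. It only uses multiprocessing.get_start_method and multiprocessing.get_all_start_methods from the standard library; the default it reports depends on the platform and the Python version.

```python
import multiprocessing as mp

# Report the start method this interpreter will use by default,
# and the ones that are available on this platform.
default_method = mp.get_start_method()
available = mp.get_all_start_methods()

print("default start method:", default_method)
print("available start methods:", available)
```

Note that calling get_start_method() without arguments fixes the default for the rest of the process, so this is best run as a separate script rather than pasted in front of a set_start_method call.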

You can see these effects by varying the sleep calls and the start_method:

from multiprocessing import Pool, current_process, set_start_method
from collections import Counter
from time import sleep
from random import random

# force some processes to take longer to start
sleep(random() * 0.1)

def f(_):
    sleep(0.01) # make the task take >10ms to complete
    return current_process().pid

if __name__ == '__main__':
    # force workers to be spawned rather than forked under Unix OSs
    set_start_method("spawn")
    with Pool(processes=4) as pool:
        print(Counter(pool.map(f, range(10))))

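Under Linux 5.19, creating a process by forking is roughly 30 times faster than spawning one. The more work you do at the top level of the module, the worse it gets: merely adding import numpy makes spawning around 100 times slower than forking.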

To see the difference between the start methods, I used the following:

# import numpy as np

def main():
    # import in function to minimise spawn time, annoyingly this doesn't have
    # much noticeable effect
    from multiprocessing import Process, set_start_method
    from sys import argv
    from time import monotonic

    set_start_method(argv[1])

    # run it a few times, the first run is always worse
    results = []
    for _ in range(10):
        t0 = monotonic()
        p = Process()
        p.start()
        p.join()
        results.append(monotonic() - t0)

    for dt in results:
        print(f"{dt*1000:.3f}ms")

if __name__ == "__main__":
    main()

For reference, the results on my laptop with Python 3.10.7 and Linux 5.19.11 are:

> echo performance | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
> python fork_vf_spawn.py spawn | tail -3
34.456ms
34.562ms
34.089ms
> python fork_vf_spawn.py fork | tail -3
1.074ms
1.258ms
1.133ms

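Changing the CPU frequency governor helps minimise the variance caused by the frequency changing dynamically in response to workload changes. Uncommenting the import numpy line causes spawn to rise to ~110ms.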
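As a quick check of the ratios, the timings quoted above can be averaged and compared. This sketch uses only the numbers printed in this answer; the variable names are made up for illustration, and it assumes the fork timing stays about the same when numpy is imported, which was not measured separately here.

```python
from statistics import mean

# Timings in milliseconds, copied from the output above.
spawn_ms = [34.456, 34.562, 34.089]
fork_ms = [1.074, 1.258, 1.133]
# Approximate spawn time with "import numpy" uncommented, as quoted in the text.
spawn_numpy_ms = 110.0

ratio = mean(spawn_ms) / mean(fork_ms)
# Assumption: fork is unaffected by the extra import.
ratio_numpy = spawn_numpy_ms / mean(fork_ms)

print(f"spawn vs fork: {ratio:.1f}x")
print(f"spawn with numpy vs fork: {ratio_numpy:.1f}x")
```

With these samples the first ratio comes out close to 30 and the second a little under 100, in line with the rough figures given earlier.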

1 2022-09-28 10:12:54