Python: dynamically control the number of processes in a multiprocessing script based on available RAM and the function's arguments

Question

I have an unusual problem with Python. I am using the multiprocessing library to map a function f((dynamic1, dynamic2), fix1, fix2).

import multiprocessing as mp

fix1 = 4
fix2 = 6

# Number of cores to use
N = 6

dynamic_duos = [(a, b) for a in range(5) for b in range(10)]

with mp.Pool(processes = N) as p:
    p.starmap(f, [(dyn, fix1, fix2) for dyn in dynamic_duos])

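I would like to control the number of active processes dynamically, because the function sometimes uses a great deal of RAM. The idea is to check at each iteration (that is, before any call to f) whether sum(dyn) is below a threshold and the amount of available RAM is above a threshold. If both conditions hold, a new process can be started to compute the function.

An additional constraint is the maximum number of processes: the number of cores on the PC.

Thanks for the help :)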

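Edit: details on the reasons.

Some combinations of parameters have a high RAM consumption (up to 80 GB for a single process). I know which ones will use a lot of RAM. When the program encounters one of them, I would like to wait for the other processes to finish, run this high-RAM combination on its own in a single process, and then resume the computation with more processes for the rest of the combinations to map.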

Edit: my attempt based on the answer below:

It does not work, but it does not raise an error either. The program simply finishes.

# Imports
import itertools
import concurrent.futures
import os
import time

# Parameters
N = int(input("Number of CPUs to use: "))
t0 = 0
tf = 200
s_step = 0.05
max_s = None
folder = "test"

possible_dynamics = [My_class(x) for x in [20, 30, 40, 50, 60]]
dynamics_to_compute = [list(x) for x in itertools.combinations_with_replacement(possible_dynamics , 2)] + [list(x) for x in itertools.combinations_with_replacement(possible_dynamics , 3)]

function_inputs = [(dyn , t0, tf, s_step, max_s, folder) for dyn in dynamics_to_compute]

# -----------
# Computation
# -----------
start = time.time()

# Pool creation and computation
futures = []
pool = concurrent.futures.ProcessPoolExecutor(max_workers = N)

for Obj, t0, tf, s_step, max_s, folder in function_inputs:
    if large_memory(Obj, s_step, max_s):
        concurrent.futures.wait(futures)  # wait for all pending tasks
        large_future = pool.submit(compute, Obj, t0, tf, 
                             s_step, max_s, folder)
        large_future.result()  # wait for large computation to finish
    else:
        future = pool.submit(compute, Obj, t0, tf, 
                             s_step, max_s, folder)
        futures.append(future)

end = time.time()
if round(end-start, 3) < 60:
    print ("Complete - Elapsed time: {} s".format(round(end-start,3)))
else:
    print ("Complete - Elapsed time: {} mn and {} s".format(int((end-start)//60), round((end-start)%60,3)))

os.system("pause")

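This is still a simplified example of my code, but the idea is there. It runs in less than 0.2 seconds, which means it never actually calls the function compute.

Note: Obj is not the actual variable name.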

Answer 1

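To achieve this, you need to give up map in order to gain more control over the execution flow of your tasks.

This code implements the algorithm you describe at the end of your question. I suggest using the concurrent.futures library, as it exposes a neater set of APIs.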

import concurrent.futures

pool = concurrent.futures.ProcessPoolExecutor(max_workers=6)

futures = []

# fix1 and fix2 are constants; only dyn changes from task to task
for dyn in dynamic_duos:
    if large_memory(dyn, fix1, fix2):
        concurrent.futures.wait(futures)  # wait for all pending tasks
        large_future = pool.submit(f, dyn, fix1, fix2)
        large_future.result()  # wait for large computation to finish
    else:
        future = pool.submit(f, dyn, fix1, fix2)
        futures.append(future)

concurrent.futures.wait(futures)  # wait for the remaining small tasks
pool.shutdown()
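
The same scheduling idea can be wrapped in a function that also collects every result. This is only a sketch under stated assumptions: run_with_memory_gate and its executor_cls parameter are names made up for illustration, and f and large_memory below are placeholders (the threshold of 10 is arbitrary) standing in for the real computation and the real memory rule. Collecting results matters because an exception raised inside a worker is stored on its future and is re-raised only when result() is called. That is one possible reason a script can appear to finish instantly without any error, especially if the elapsed time is measured before the pending futures have been waited on.

```python
import concurrent.futures


def f(dyn, fix1, fix2):
    # Placeholder workload (assumption): stands in for the real computation.
    return sum(dyn) * fix1 + fix2


def large_memory(dyn, fix1, fix2):
    # Hypothetical rule (assumption): a large sum marks a memory-hungry task.
    return sum(dyn) > 10


def run_with_memory_gate(tasks, func, is_large, max_workers,
                         executor_cls=concurrent.futures.ProcessPoolExecutor):
    """Run func(*args) for each args tuple in tasks, in submission order.

    A task flagged by is_large(*args) runs alone: all pending tasks are
    drained first, then the large task runs to completion before anything
    else is submitted. Results are returned in the same order as tasks.
    """
    results = [None] * len(tasks)
    pending = {}  # maps each in-flight future to its index in results

    def drain():
        # result() re-raises worker exceptions, so failures are not lost.
        for future in concurrent.futures.as_completed(list(pending)):
            results[pending.pop(future)] = future.result()

    with executor_cls(max_workers=max_workers) as pool:
        for index, args in enumerate(tasks):
            if is_large(*args):
                drain()  # wait for every pending small task
                results[index] = pool.submit(func, *args).result()
            else:
                pending[pool.submit(func, *args)] = index
        drain()  # wait for the small tasks submitted after the last large one
    return results
```

With the default ProcessPoolExecutor, func has to be picklable (a module-level function, for example), and on Windows the call belongs under an if __name__ == "__main__": guard. Passing executor_cls=concurrent.futures.ThreadPoolExecutor is a convenient way to try the scheduling logic without spawning processes.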

Solution 1
1 Accepted 2018-05-15 12:17:54
