
Python multiprocessing using a lock or manager list for Pool workers accessing a global list variable

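I am trying to distribute jobs over several CUDA devices, where the total number of jobs running at any time should be less than or equal to the number of available CPU cores. To do this, I determine the number of available "slots" on each device and create a list that holds the available slots. If I have 6 CPU cores and two CUDA devices (0 and 1), then AVAILABLE_GPU_SLOTS = [0, 1, 0, 1, 0, 1]. In my worker function I pop a slot off the list and save it to a variable, set the CUDA_VISIBLE_DEVICES env var for the subprocess call, and then append the slot back to the list. This has been working so far, but I want to avoid race conditions.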
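The helper that builds the slot list is not shown in the question. As a minimal sketch of the idea described above, assuming slots are simply handed out round-robin across the device ids, a stand-in could look like this (the body is an assumption made for illustration, not the asker's actual implementation):

```python
def build_available_gpu_slots(pool_size, cuda_devices):
    """Return one device id per worker, cycling through the devices.

    Hypothetical stand-in for the helper of the same name used in the
    question; the round-robin layout is an assumption.
    """
    if not cuda_devices:
        raise ValueError("at least one CUDA device id is required")
    # Worker i gets device i modulo the number of devices.
    return [cuda_devices[i % len(cuda_devices)] for i in range(pool_size)]


if __name__ == "__main__":
    # Six worker processes shared between devices 0 and 1.
    print(build_available_gpu_slots(6, [0, 1]))
```

With this layout the number of slots always equals the pool size, so a worker that pops a slot never finds the list empty as long as every slot is eventually returned.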

The current code is as follows:

import multiprocessing
import os
import subprocess

def work(cmd):
    # Borrow a device id, run the job pinned to that device, then return the id.
    slot = AVAILABLE_GPU_SLOTS.pop()
    exit_code = subprocess.call(cmd, shell=False, env=dict(os.environ, CUDA_VISIBLE_DEVICES=str(slot)))
    AVAILABLE_GPU_SLOTS.append(slot)
    return exit_code

if __name__ == '__main__':
    # YANK_FILES, build_cmd, get_cuda_devices and build_available_gpu_slots are defined elsewhere.
    pool_size = multiprocessing.cpu_count()
    mols_to_be_run = [name for name in os.listdir(YANK_FILES) if os.path.isdir(os.path.join(YANK_FILES, name))]
    cmds = build_cmd(mols_to_be_run)
    cuda = get_cuda_devices()
    AVAILABLE_GPU_SLOTS = build_available_gpu_slots(pool_size, cuda)
    pool = multiprocessing.Pool(processes=pool_size, maxtasksperchild=2)
    pool.map(work, cmds)

Can I simply declare lock = multiprocessing.Lock() at the same level as AVAILABLE_GPU_SLOTS, put it in cmds, and then inside work() do

with lock:
    slot = AVAILABLE_GPU_SLOTS.pop()
# subprocess stuff
with lock:
    AVAILABLE_GPU_SLOTS.append(slot)

or do I need a manager list? Or perhaps there is a better solution for what I am doing.
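One detail about the "put it in cmds" idea is worth checking in isolation: arguments passed through pool.map are pickled, and a plain multiprocessing.Lock refuses to be pickled outside of process creation. The short sketch below illustrates this with a hypothetical helper, can_be_pickled, that is not part of the question's code:

```python
import multiprocessing
import pickle


def can_be_pickled(obj):
    """Return True if obj survives pickle.dumps, False if it raises RuntimeError."""
    try:
        pickle.dumps(obj)
        return True
    except RuntimeError:
        # multiprocessing.Lock raises RuntimeError here: lock objects may
        # only be shared with child processes while they are being created.
        return False


if __name__ == "__main__":
    print("list of slots:", can_be_pickled([0, 1, 0, 1]))
    print("multiprocessing.Lock:", can_be_pickled(multiprocessing.Lock()))
```

This suggests handing the lock to the workers when the pool is created, for instance through the initializer and initargs arguments of multiprocessing.Pool, rather than sending it along with each task.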

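Based on what I found in the following SO answer: "Python sharing a lock between processes".

As expected, using a regular list results in each process having its own copy. Using a manager list seems to be sufficient to solve this. Example code: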

from __future__ import print_function

import multiprocessing
import time

def doing_work(honk):
    # Relies on fork: workers inherit the SLOTS_LIST global defined under __main__.
    proc = multiprocessing.current_process()
    # with lock:
    #     print(proc, 'about to pop SLOTS_LIST', SLOTS_LIST)
    #     slot = SLOTS_LIST.pop()
    #     print(proc, ' just popped', slot, 'from', SLOTS_LIST)
    print(proc, 'about to pop SLOTS_LIST', SLOTS_LIST)
    slot = SLOTS_LIST.pop()
    print(proc, ' just popped', slot, 'from SLOTS_LIST')
    time.sleep(10)

def init(l):
    # Pool initializer: make the shared lock available as a global in each worker.
    global lock
    lock = l

if __name__ == '__main__':
    man = multiprocessing.Manager()
    SLOTS_LIST = [1,34,3465,456,4675,6,4]
    SLOTS_LIST = man.list(SLOTS_LIST)
    l = multiprocessing.Lock()
    pool = multiprocessing.Pool(processes=2, initializer=init, initargs=(l,))
    inputs = range(len(SLOTS_LIST))
    pool.map(doing_work, inputs)

Which outputs:

<Process(PoolWorker-3, started daemon)> about to pop SLOTS_LIST [1, 34, 3465, 456, 4675, 6, 4]
<Process(PoolWorker-3, started daemon)>  just popped 4 from SLOTS_LIST
<Process(PoolWorker-2, started daemon)> about to pop SLOTS_LIST [1, 34, 3465, 456, 4675, 6]
<Process(PoolWorker-2, started daemon)>  just popped 6 from SLOTS_LIST
<Process(PoolWorker-3, started daemon)> about to pop SLOTS_LIST [1, 34, 3465, 456, 4675]
<Process(PoolWorker-3, started daemon)>  just popped 4675 from SLOTS_LIST
<Process(PoolWorker-2, started daemon)> about to pop SLOTS_LIST [1, 34, 3465, 456]
<Process(PoolWorker-2, started daemon)>  just popped 456 from SLOTS_LIST
<Process(PoolWorker-3, started daemon)> about to pop SLOTS_LIST [1, 34, 3465]    
<Process(PoolWorker-3, started daemon)>  just popped 3465 from SLOTS_LIST
<Process(PoolWorker-2, started daemon)> about to pop SLOTS_LIST [1, 34]
<Process(PoolWorker-2, started daemon)>  just popped 34 from SLOTS_LIST
<Process(PoolWorker-3, started daemon)> about to pop SLOTS_LIST [1]
<Process(PoolWorker-3, started daemon)>  just popped 1 from SLOTS_LIST

This is the desired behavior. I am not sure it completely eliminates race conditions, but it seems good enough, and adding a lock on top of it is simple.
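To tie the manager list and the lock back to the original work() function, here is a minimal sketch. The names init_worker and run_all are hypothetical, the try/finally that always returns the slot is an addition not present in the code above, and the sketch assumes the pool has no more workers than there are slots, so pop() never sees an empty list.

```python
import multiprocessing
import os
import subprocess
import sys


def init_worker(slots, lock):
    """Pool initializer: keep the shared list and lock as per-worker globals."""
    global SLOTS, LOCK
    SLOTS = slots
    LOCK = lock


def work(cmd):
    """Run cmd pinned to a borrowed GPU slot and always hand the slot back."""
    with LOCK:
        slot = SLOTS.pop()
    try:
        env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(slot))
        return subprocess.call(cmd, shell=False, env=env)
    finally:
        # Return the slot even if the subprocess call raises.
        with LOCK:
            SLOTS.append(slot)


def run_all(cmds, slot_values):
    """Run every command, using one pool worker per slot.

    Returns the exit codes and the final contents of the shared slot list.
    """
    with multiprocessing.Manager() as manager:
        slots = manager.list(slot_values)
        lock = multiprocessing.Lock()
        with multiprocessing.Pool(processes=len(slot_values),
                                  initializer=init_worker,
                                  initargs=(slots, lock)) as pool:
            exit_codes = pool.map(work, cmds)
        # Copy the proxy's contents before the manager shuts down.
        return exit_codes, list(slots)


if __name__ == "__main__":
    # Placeholder jobs: each one just starts a Python interpreter and exits.
    demo_cmds = [[sys.executable, "-c", "pass"] for _ in range(6)]
    codes, remaining = run_all(demo_cmds, [0, 1, 0, 1])
    print(codes, sorted(remaining))
```

Passing both shared objects through the initializer avoids relying on globals inherited at fork time, so the same structure should also work with the spawn start method.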

