
python multiprocessing lock issue

I want to merge a list of dicts together using the Python multiprocessing module.

Here is a simplified version of my code:

#!/usr/bin/python2.7
# -*- coding: utf-8 -*-

import multiprocessing
import functools
import time

def merge(lock, d1, d2):
    time.sleep(5) # some time-consuming stuff
    with lock:
        for key in d2.keys():
            if d1.has_key(key):
                d1[key] += d2[key]
            else:
                d1[key] = d2[key]

l = [{ x % 10 : x } for x in range(10000)]
lock = multiprocessing.Lock()
d = multiprocessing.Manager().dict()

partial_merge = functools.partial(merge, d1 = d, lock = lock)

pool_size = multiprocessing.cpu_count()
pool = multiprocessing.Pool(processes = pool_size)
pool.map(partial_merge, l)
pool.close()
pool.join()

print d
  1. I get this error when running the script. How should I resolve it?

    RuntimeError: Lock objects should only be shared between processes through inheritance

  2. Is the lock in the merge function needed in this case, or will Python take care of it?

  3. I think what map is supposed to do is map items from one list to another list, not dump everything from one list into a single object. So is there a more elegant way to do this?

The following should run cross-platform (i.e. on Windows, too) in both Python 2 and 3. It uses a process pool initializer to set the manager dict as a global in each child process.

FYI:

  • Using a lock is unnecessary with a manager dict: each individual proxy operation is process-safe on its own. (A compound read-modify-write such as d1[key] += d2[key] is still two separate proxy calls, though; see the locked variant after the code below if you need it to be atomic.)
  • The number of processes in a Pool defaults to the CPU count.
  • If you're not interested in the result, you can use apply_async instead of map.
import multiprocessing
import time

def merge(d2):
    time.sleep(1) # some time-consuming stuff
    for key in d2.keys():
        if key in d1:
            d1[key] += d2[key]
        else:
            d1[key] = d2[key]

def init(d):
    # Pool initializer: runs once in each worker process and makes the
    # shared manager dict available there as the global d1.
    global d1
    d1 = d

if __name__ == '__main__':

    d1 = multiprocessing.Manager().dict()
    pool = multiprocessing.Pool(initializer=init, initargs=(d1, ))

    l = [{ x % 5 : x } for x in range(10)]

    for item in l:
        pool.apply_async(merge, (item,))

    pool.close()
    pool.join()

    print(l)
    print(d1)
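
If a lock actually is needed (for example, to make the read-modify-write d1[key] += d2[key] strictly atomic, since it is two separate proxy calls on the manager dict and another worker could interleave between them), the RuntimeError from the question can be avoided by passing the lock through the pool initializer as well, instead of as a map argument. A minimal sketch of that variant, under the same assumptions as the code above:

import multiprocessing
import time

def merge(d2):
    time.sleep(1) # some time-consuming stuff
    with lock: # serialize the whole read-modify-write per item
        for key in d2.keys():
            if key in d1:
                d1[key] += d2[key]
            else:
                d1[key] = d2[key]

def init(d, l):
    # Pool initializer: runs once per worker; objects that must be
    # "shared through inheritance", such as locks, can be handed over here.
    global d1, lock
    d1 = d
    lock = l

if __name__ == '__main__':

    lock = multiprocessing.Lock()
    d1 = multiprocessing.Manager().dict()
    pool = multiprocessing.Pool(initializer=init, initargs=(d1, lock))

    l = [{ x % 5 : x } for x in range(10)]

    for item in l:
        pool.apply_async(merge, (item,))

    pool.close()
    pool.join()

    print(d1)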

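As for question 3: if you would rather keep map in its usual role (one output per input) and avoid shared state entirely, you can do the heavy per-item work in the workers, collect the results with map, and merge them in the parent. A sketch of that map-then-reduce arrangement, assuming the values are numbers so collections.Counter can sum them:

import multiprocessing
import time
from collections import Counter

def process(d2):
    time.sleep(1) # simulate the per-item work done in the worker
    return d2     # each worker returns its (possibly transformed) dict

if __name__ == '__main__':

    items = [{ x % 5 : x } for x in range(10)]

    pool = multiprocessing.Pool()
    results = pool.map(process, items) # map: one result per input item
    pool.close()
    pool.join()

    # reduce: merge in the parent, so no lock or manager dict is needed
    merged = Counter()
    for d in results:
        merged.update(d) # Counter.update adds the values key by key

    print(dict(merged))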