Appending values to a defaultdict(list) does not work when it is defined through a multiprocessing manager
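I am using multiprocessing to split a task across child processes. I created a shared dictionary by customizing the multiprocessing manager to register defaultdict(list). The child processes update this dictionary by appending values, but the resulting dictionary contains only the keys; the corresponding values are missing.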
from collections import defaultdict
from multiprocessing.managers import BaseManager, DictProxy
import multiprocessing as mp, io

class MyManager(BaseManager):
    pass

# some_processing() and chunkify() are defined elsewhere and not shown here.
def process_read(chunkStart, chunkSize, doc_cluster, f_path):
    # Note: this lock is created anew on every call, so it is not shared between processes.
    lock = mp.Lock()
    with io.open(f_path) as f_handle:
        f_handle.seek(chunkStart)
        lines = f_handle.read(chunkSize).splitlines()
        for line in lines:
            f_name = line.rstrip()
            key = some_processing()
            lock.acquire()
            try:
                # doc_cluster is a DictProxy; this is the append that gets lost.
                doc_cluster[key].append(f_name)
            finally:
                lock.release()

if __name__ == '__main__':
    results = []
    cores = mp.cpu_count()
    pool = mp.Pool(cores)
    fp = 'B:/FN/FN_test.txt'
    MyManager.register('defaultdict', defaultdict, DictProxy)
    mgr = MyManager()
    mgr.start()
    d_cluster = mgr.defaultdict(list)
    for chunk_Start, chunk_Size in chunkify(fp):
        results.append(pool.apply_async(process_read, args=(chunk_Start, chunk_Size, d_cluster, fp,)))
    pool.close()
    pool.join()
    print(d_cluster)
Actual output:
{'cd':[],'ab':[],'bc':[]}
Expected output:
{'cd':[f_name1,f_name2,f_name3],'ab':[f_name5,f_name7],'bc':[f_name4,f_name6]}
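A likely cause, sketched here under some assumptions: doc_cluster[key] on a DictProxy runs __getitem__ inside the manager process, where the defaultdict creates the key with an empty list, and then sends a pickled copy of that list back to the caller. The append only changes the local copy, which would explain why the keys exist but the lists stay empty. One workaround is to read the list, modify it, and assign it back through the proxy while holding a lock that is itself created by the manager. The sketch below is written for Python 3 and runs everything in the parent process to keep it short; the names append_shared and demo are made up for illustration, and subclassing SyncManager instead of BaseManager is an assumption made so that mgr.Lock() is available.

```python
from collections import defaultdict
from multiprocessing.managers import DictProxy, SyncManager


class MyManager(SyncManager):
    """Assumption: subclass SyncManager so Lock() is registered as well."""


MyManager.register("defaultdict", defaultdict, DictProxy)


def append_shared(shared, lock, key, value):
    """Hypothetical helper: read-modify-write through the proxy."""
    with lock:                   # manager lock, so it can be shared across processes
        items = shared[key]      # a local copy of the list held by the manager
        items.append(value)      # changes only the local copy
        shared[key] = items      # __setitem__ sends the updated list back


def demo():
    with MyManager() as mgr:     # starts the manager process, shuts it down on exit
        shared = mgr.defaultdict(list)
        lock = mgr.Lock()

        # Same pattern as in the question: the key is created, the value is lost.
        shared["ab"].append("lost")
        before = dict(shared.copy())

        append_shared(shared, lock, "ab", "f_name5")
        append_shared(shared, lock, "ab", "f_name7")
        after = dict(shared.copy())
    return before, after


if __name__ == "__main__":
    before, after = demo()
    print(before)
    print(after)
```

In the code from the question, this would mean creating the lock once with the manager in the main block and passing both the proxy and that lock to process_read, instead of calling mp.Lock() inside the worker. Each update costs extra round trips to the manager, so for large inputs it may be cheaper to have every worker return its own plain dict and merge the results in the parent.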