Python multiprocessing shared list
I am trying to create a script that parses a file and converts it into a large list, which should then be processed in parallel. I have tried several Python multiprocessing implementations, but they all seem to run sequentially.
import multiprocessing
from multiprocessing import Manager
from itertools import izip_longest

def grouper(n, iterable, padvalue=None):
    """grouper(3, 'abcdefg', 'x') -->
    ('a','b','c'), ('d','e','f'), ('g','x','x')"""
    return izip_longest(*[iter(iterable)]*n, fillvalue=padvalue)
def createRecords(givenchunk):
    for i1 in range(len(givenchunk)):
        <create somedata>
        records.append(somedata)

def log_result(result):
    # Placeholder: the real callback was not shown in the original post.
    pass
if __name__ == '__main__':
    manager = Manager()
    parsedcdr = manager.list([])
    records = manager.list([])

    <some general processing here which creates the shared list "parsedcdr".
     Uses map to create a process "p" in some def which is terminated afterwards.>
    # Get the number of available CPUs.
    cores = multiprocessing.cpu_count()

    # First implementation: map.
    t = multiprocessing.Pool(cores)
    print "Map processing with chunks containing 5000"
    t.map(createRecords, grouper(5000, parsedcdr))
    # Second implementation: apply_async.
    t = multiprocessing.Pool(cores)
    print "Async processing with chunks containing 5000"
    for chunk in grouper(5000, parsedcdr):
        t.apply_async(createRecords, args=(chunk,), callback=log_result)
    t.close()
    t.join()
    # Third implementation: Process.
    print "Process processing with chunks containing 5000"
    jobs = []
    for chunk in grouper(5000, parsedcdr):
        t = multiprocessing.Process(target=createRecords, args=(chunk,))
        t.start()
        jobs.append(t)
    for j in jobs:
        j.join()
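
For reference, here is a stripped-down, self-contained variant of the map approach (dummy data and a dummy squaring workload stand in for my real parsing and record creation), which makes it easy to see whether chunks are dispatched to different workers:

import multiprocessing
from itertools import izip_longest

def grouper(n, iterable, padvalue=None):
    return izip_longest(*[iter(iterable)]*n, fillvalue=padvalue)

def createRecords(givenchunk):
    # Report which worker received the chunk, so parallel dispatch is visible.
    print "%s handling %d items" % (
        multiprocessing.current_process().name, len(givenchunk))
    return sum(x * x for x in givenchunk if x is not None)  # dummy CPU work

if __name__ == '__main__':
    parsedcdr = range(100000)  # dummy data in place of the parsed file
    pool = multiprocessing.Pool(multiprocessing.cpu_count())
    pool.map(createRecords, grouper(5000, parsedcdr))
    pool.close()
    pool.join()

If the pool works, the printed worker names should vary (PoolWorker-1, PoolWorker-2, ...).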
Can anyone point me in the right direction?
Multiprocessing turns out to work fine in the example above. The problem was in another def, which caused the performance degradation.
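
One note for anyone who hits similar symptoms: every append to a Manager().list() is a round-trip to the separate manager process, so heavy per-item writes to a shared list can serialize the workers and make a pool look sequential. A common alternative (a sketch, with make_record as a hypothetical stand-in for the real per-item work) is to have each worker build and return its own records, and let the parent concatenate them:

import multiprocessing
from itertools import izip_longest, chain

def grouper(n, iterable, padvalue=None):
    return izip_longest(*[iter(iterable)]*n, fillvalue=padvalue)

def make_record(item):
    # Hypothetical stand-in for the real per-item record creation.
    return {'id': item, 'value': item * 2}

def createRecords(givenchunk):
    # Build the chunk's records locally and return them: no per-append
    # round-trips to a manager process.
    return [make_record(item) for item in givenchunk if item is not None]

if __name__ == '__main__':
    parsedcdr = range(100000)  # placeholder for the parsed input
    pool = multiprocessing.Pool(multiprocessing.cpu_count())
    chunked = pool.map(createRecords, grouper(5000, parsedcdr))
    pool.close()
    pool.join()
    records = list(chain.from_iterable(chunked))
    print "Created %d records" % len(records)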