Use multiprocessing for a for loop, Python
I have a for loop that uses some binary conditions and finally writes a file accordingly. The problem is that the conditions are true for many files (sometimes around 1000 files need to be written), so writing them takes a long time (around 10 minutes). I know I can somehow use Python's multiprocessing module and utilise some of the cores.

This is the code that works, but it only uses one core:

for i, n in enumerate(halo_param.strip()):
    mask = var1['halo_id'] == n
    newtbdata = tbdata1[mask]
    hdu = pyfits.BinTableHDU(newtbdata)
    hdu.writeto('/home/Documments/file_{0}.fits'.format(i))

I came across the suggestion that this can be done using Pool from multiprocessing:

if __name__ == '__main__':
    pool = Pool(processes=4)

I would like to know how to do this and utilise at least 4 of my cores.

Restructure the for loop body as a function, and use Pool.map with that function:

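from multiprocessing import Pool

# var1, tbdata1, halo_param and pyfits are assumed to be defined or
# imported at module level, as in the question.

def work(arg):
    # Each task is one (index, halo_id) pair produced by enumerate().
    i, n = arg
    mask = var1['halo_id'] == n
    newtbdata = tbdata1[mask]
    hdu = pyfits.BinTableHDU(newtbdata)
    hdu.writeto('/home/Documments/file_{0}.fits'.format(i))

if __name__ == '__main__':
    pool = Pool(processes=4)
    # map() blocks until every task has finished.
    pool.map(work, enumerate(halo_param.strip()))
    pool.close()
    pool.join()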
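
To try the pattern without the original FITS data, here is a minimal self-contained sketch. It is not the asker's code: it assumes plain-text output instead of pyfits, and ROWS, write_group and write_all are hypothetical names invented for the illustration. The structure (one function per task, Pool.map over the enumerate pairs) is the same as above.

```python
import os
import tempfile
from functools import partial
from multiprocessing import Pool

# Hypothetical stand-in for the table: (halo_id, value) rows.
ROWS = [(hid, hid * 10 + k) for hid in range(8) for k in range(3)]


def write_group(out_dir, arg):
    """Select the rows for one halo id and write them to their own file."""
    i, halo_id = arg
    selected = [value for hid, value in ROWS if hid == halo_id]
    path = os.path.join(out_dir, "file_{0}.txt".format(i))
    with open(path, "w") as fh:
        fh.write("\n".join(str(v) for v in selected))
    return path


def write_all(halo_ids, out_dir, processes=4):
    """Run write_group over every (index, halo_id) pair in a worker pool."""
    # partial() binds the output directory so each task is a single argument.
    worker = partial(write_group, out_dir)
    with Pool(processes=processes) as pool:
        return pool.map(worker, enumerate(halo_ids))


if __name__ == "__main__":
    out_dir = tempfile.mkdtemp()
    paths = write_all(range(8), out_dir)
    print(len(paths), "files written to", out_dir)
```

Whether this is actually faster depends on where the time goes: if the loop is limited by disk writes rather than by CPU work, extra processes may help little, so it is worth timing both versions.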