Using multiprocessing for a for loop, Python

Question

I have a for loop that checks some binary conditions and then writes a file accordingly. The problem is that the conditions are true for many files (sometimes around 1000 files need to be written), so writing them takes a long time (around 10 minutes). I know I can somehow use Python's multiprocessing to utilise some of the cores.

This is the code that works, but it only uses one core.

for i,n in enumerate(halo_param.strip()):
    mask = var1['halo_id'] == n
    newtbdata = tbdata1[mask]
    hdu = pyfits.BinTableHDU(newtbdata)
    hdu.writeto(('/home/Documments/file_{0}.fits').format(i))

I found that this can be done using Pool from multiprocessing.

if __name__ == '__main__': pool = Pool(processes=4)

I would like to know how to do it and utilise at least 4 of my cores.

Answer 1

Restructure the for loop body as a function, and use Pool.map with that function.

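from multiprocessing import Pool

# var1, tbdata1, halo_param and pyfits are assumed to be defined or
# imported at module level, as in the question.
def work(arg):
    # Each task is an (index, halo id) pair produced by enumerate().
    i, n = arg
    mask = var1['halo_id'] == n
    newtbdata = tbdata1[mask]
    hdu = pyfits.BinTableHDU(newtbdata)
    hdu.writeto(('/home/Documments/file_{0}.fits').format(i))

if __name__ == '__main__':
    pool = Pool(processes=4)
    # map blocks until every task has been processed by a worker.
    pool.map(work, enumerate(halo_param.strip()))
    pool.close()
    pool.join()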
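
For anyone who wants to try the pattern without pyfits or the original data, here is a minimal self-contained sketch of the same idea. It is only an illustration under stated assumptions: ROWS is made-up stand-in data for tbdata1, write_group and write_all are hypothetical names, and plain text files are written instead of FITS tables.

```python
import os
import tempfile
from multiprocessing import Pool

# Hypothetical stand-in data: (halo_id, value) rows, in place of tbdata1.
ROWS = [(1, 0.5), (2, 1.5), (1, 2.5), (3, 3.5), (2, 4.5)]


def write_group(task):
    """Write every row matching one halo id to its own file; return the path."""
    index, halo_id, out_dir = task
    # Plays the same role as the boolean mask in the original loop body.
    selected = [row for row in ROWS if row[0] == halo_id]
    path = os.path.join(out_dir, "file_{0}.txt".format(index))
    with open(path, "w") as handle:
        for hid, value in selected:
            handle.write("{0} {1}\n".format(hid, value))
    return path


def write_all(halo_ids, out_dir, processes=4):
    """Hand one task per halo id to a pool of worker processes."""
    tasks = [(i, n, out_dir) for i, n in enumerate(halo_ids)]
    # The with-block shuts the pool down once map() has returned.
    with Pool(processes=processes) as pool:
        return pool.map(write_group, tasks)


if __name__ == "__main__":
    out_dir = tempfile.mkdtemp()
    for path in write_all([1, 2, 3], out_dir):
        print(path)
```

The worker takes a single tuple because Pool.map passes exactly one argument per task. Keeping the data at module level and the pool creation under the `__main__` guard means the sketch should also behave on platforms where worker processes re-import the main module instead of forking.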


Solution 1
1 accepted 2014-09-13 19:42:44
