How do I parallelize a simple loop in Python?
I have a loop that crashes my RAM every time, and I would like to parallelize it.
I tried this code, but it doesn't work:
from joblib import Parallel, delayed
from Bio.Align.Applications import ClustalOmegaCommandline

def run(test):
    im = process_image(Image.open(test['Path'][i]))
    test_images.append(im)

if __name__ == "__main__":
    test_images = []
    test = range(len(test))
    Parallel(n_jobs=len(test)(
        delayed(run)(i) for i in len(test))
I got this error:
File "", line 16
    delayed(run)(i) for i in len(test))
                                      ^
SyntaxError: unexpected EOF while parsing
My loop:
test_images = []
for i in range(len(test)):
    im = process_image(Image.open(test['Path'][i]))
    test_images.append(im)
test_images = np.asarray(test_images)
I have tried several solutions, but I need a single database output.
Can you try the following:
import concurrent.futures

import numpy as np
from PIL import Image

def process_image(img_path):
    img_obj = Image.open(img_path)
    # your logic here
    return img_obj

def main():
    image_dict = {}
    with concurrent.futures.ProcessPoolExecutor() as executor:
        # executor.map preserves input order, so paths and results line up
        for img_path, im in zip(test['Path'], executor.map(process_image, test['Path'])):
            image_dict[img_path] = im
    return image_dict

if __name__ == '__main__':
    image_dict = main()
    # dict.values() is a view in Python 3; convert to a list before np.asarray
    test_images = np.asarray(list(image_dict.values()))
I am not sure if parallelization is the answer to memory problems.
Do you need to store every image in a list held in memory? Maybe just save the paths and load each image when it is needed?
Or try out generators. With a generator the values are produced lazily (only when they are needed), which results in lower memory consumption.
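A minimal sketch of the generator idea; `image_stream` is a hypothetical helper, and the uppercasing lambda is just a dummy stand-in for the question's `process_image`:

```python
def image_stream(paths, process_image):
    # Yields one processed image at a time; nothing is kept in memory
    # beyond the item currently being consumed.
    for img_path in paths:
        yield process_image(img_path)

# Dummy "processing" function for demonstration; in the real code this
# would open the image with PIL and apply the user's logic.
images = image_stream(["a.png", "b.png"], lambda p: p.upper())
print(next(images))  # only the first path has been processed so far
```

If the downstream code truly needs everything at once (e.g. one NumPy array), a generator won't reduce peak memory, but for per-image work it keeps only one image live at a time.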