
Post-processing results after multi-processing in Python

I have a simple multiprocessing (MP) script that works like a charm. However, as soon as I add a very simple post-processing step on the data generated via MP, the code stops working: it never finishes and runs forever. This is the code (as shown, with the last line commented out, it works perfectly):

import numpy as np
from multiprocessing import Pool

n = 4
nMCS = 10**5

def my_function(j):
    result = []
    for j in range(nMCS // n):
        a = np.random.rand(10,2)
        result.append(a) 
    return result

if __name__ == '__main__':
    __spec__ = "ModuleSpec(name='builtins', loader=<class '_frozen_importlib.BuiltinImporter'>)" # this is because I am using Spyder!

    pool = Pool(processes = n) 

    data = pool.map(my_function, [i for i in range(n)])

    pool.close()
    pool.join()

#final_result = np.concatenate(data)   ### this is what ruins my code! ###

If I uncomment final_result = np.concatenate(data) at the end, the script never finishes. I am using Spyder, and if I simply type final_result = np.concatenate(data) in the console after MP is done, it gives me what I want, i.e. the concatenated result. However, if I put that same line at the very end of the main program, it just doesn't work. Could anyone tell me how to fix this?

P.S. This is a very simple example I put together so you can see what is going on; my real problem is far more complicated, and doing the post-processing by hand after MP has finished is not an option.

Your problem is that np.concatenate is not executed inside the if __name__ == '__main__': block. I suspect the problem you're encountering is Spyder-specific, but fixing the indentation should resolve it.

As @Ares already implied, you fix the problem by indenting everything below the if __name__ == "__main__": statement into the if block.
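A minimal sketch of the indentation fix described above, using the names from the question. Wrapping the work in a helper called main() and using np.random.default_rng() inside the worker are my own assumptions, not part of the original post; the Spyder-specific __spec__ line is left out here.

```python
import numpy as np
from multiprocessing import Pool

n = 4          # number of worker processes
nMCS = 10**5   # total number of samples across all workers


def my_function(worker_id):
    # worker_id is unused; every worker just produces its share of samples.
    # Assumption: a fresh generator per call, so workers do not share RNG state.
    rng = np.random.default_rng()
    result = []
    for _ in range(nMCS // n):
        result.append(rng.random((10, 2)))
    return result


def main():
    # Hypothetical helper: pool creation AND post-processing live here,
    # so only the parent process ever executes them.
    with Pool(processes=n) as pool:
        data = pool.map(my_function, range(n))
    return np.concatenate(data)


if __name__ == '__main__':
    final_result = main()
    print(final_result.shape)
```

The point of the design is that nothing which depends on data sits at module level, so a child process that re-imports the script only sees the constants and the function definitions.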

FYI, this happens on Windows, which, unlike Unix-like systems, doesn't provide forking for starting new processes and uses 'spawn' as the default (and only) start method. Spawn means the OS has to start a fresh process with a new interpreter from scratch for every worker process.

Your worker processes need to import your target function my_function. When that happens, everything not protected inside the if __name__ == "__main__": block also runs in every child process on import.
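The re-import behaviour can be observed with a small stand-alone script. This is only an illustrative sketch: the function square and the printed message are hypothetical, and the 'spawn' start method is requested explicitly so the effect also shows up on systems that would otherwise fork. It has to be saved to a file and run as a script.

```python
import multiprocessing as mp
import os

# Unguarded module-level code: it runs in the parent and, under 'spawn',
# once more in every worker when that worker re-imports this file.
print(f"module-level code running in PID {os.getpid()}")


def square(x):
    # Hypothetical target function, standing in for my_function.
    return x * x


if __name__ == '__main__':
    ctx = mp.get_context('spawn')   # force the Windows-style start method
    with ctx.Pool(processes=2) as pool:
        print(pool.map(square, range(4)))
```

The message should appear once for the parent and again for each worker. In the question's script the unguarded line is np.concatenate(data), and data is only defined inside the guarded block, so presumably each worker fails on import with a NameError and the pool never gets a usable worker, which would match the script running forever.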
