Importing modules that use multiprocessing in Python

I am looking to use the multiprocessing module to speed up the run time of some transport planning models. I've optimized as much as I can via 'normal' methods, but at the heart of it is an embarrassingly parallel problem, e.g. performing the same set of matrix operations for 4 different sets of inputs, all independent of each other.

Pseudo code:

    for mat1,mat2,mat3,mat4 in zip([a1,a2,a3,a4],[b1,b2,b3,b4],[c1,c2,c3,c4],[d1,d2,d3,d4]):
        result1 = mat1*mat2^mat3
        result2 = mat1/mat4
        result3 = mat3.T*mat2.T+mat4

So all I really want to do is process the iterations of this loop in parallel on a quad-core computer. I've read up here and in other places on the multiprocessing module, and it seems to fit the bill perfectly except for the required:

    if __name__ == '__main__':

From what I understand, this means that you can only multiprocess code that is run from a script? I.e. if I do something like:

    import multiprocessing
    from numpy.random import randn

    a = randn(100, 100)
    b = randn(100, 100)
    c = randn(100, 100)
    d = randn(100, 100)

    def process_matrix(mat):
        # Element-wise square; ^ is bitwise XOR in Python, not a power
        return mat ** 2

    if __name__ == '__main__':
        print("Multiprocessing")
        jobs = []

        for input_matrix in [a, b, c, d]:
            p = multiprocessing.Process(target=process_matrix, args=(input_matrix,))
            jobs.append(p)
            p.start()

        # Wait for all worker processes to finish
        for p in jobs:
            p.join()

It runs fine. However, assume I saved the above as 'matrix_multiproc.py' and defined a new file 'importing_test.py' which just states:

    import matrix_multiproc

The multiprocessing does not happen, because __name__ is now 'matrix_multiproc' and not '__main__'.

Does this mean I can never use parallel processing in an imported module? All I am trying to do is have my model run as:

    def Model_Run():
        import Part1, Part2, Part3, matrix_multiproc, Part4

        Part1.Run()
        Part2.Run()
        Part3.Run()
        matrix_multiproc.Run()
        Part4.Run()

Sorry for a really long question to what is probably a simple answer, thanks!

Does this mean I can never use parallel processing in an imported module?

No, it doesn't. You can use multiprocessing anywhere in your code, provided that the program's main module (the script you actually run) uses the if __name__ == '__main__' guard.
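
As a rough sketch of what that could look like for the module in the question: the parallel work lives in an ordinary function that any other module can import and call. The names process_matrix and run, the processes parameter, and the use of plain nested lists instead of NumPy arrays are assumptions made here to keep the example dependency-free. They are not taken from the original code.

```python
# matrix_multiproc.py (hypothetical layout for the module from the question)
from multiprocessing import Pool


def process_matrix(mat):
    """Square every element of a matrix given as a list of rows."""
    return [[value ** 2 for value in row] for row in mat]


def run(matrices, processes=4):
    """Process each matrix in a separate worker and return results in order."""
    with Pool(processes=processes) as pool:
        return pool.map(process_matrix, matrices)


if __name__ == "__main__":
    # Only runs when this file is executed directly, not when it is imported.
    inputs = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
    print(run(inputs, processes=2))
```

Another script can then import matrix_multiproc and call matrix_multiproc.run(...), as long as that call is ultimately reached from inside the top-level script's own if __name__ == '__main__' block, for example via Model_Run().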

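On Unix systems that start child processes with the fork() system call, you don't strictly need that guard, because the child is a copy of the running parent process and the main module is not imported again. Keep the guard anyway if the code has to be portable: Windows always spawns a fresh interpreter, and macOS has defaulted to doing the same since Python 3.8.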
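
If you are not sure which mechanism your platform uses, the multiprocessing module can report it. This is only a small sketch, and the variable names are chosen here for illustration:

```python
import multiprocessing

# Ask the module which start method is the default on this platform
# and which alternatives it supports ("fork", "spawn", "forkserver").
current = multiprocessing.get_start_method()
available = multiprocessing.get_all_start_methods()

print("default start method:", current)
print("available start methods:", available)
```

On Windows the only available method is "spawn", which is why the guard is mandatory there.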

On Windows, on the other hand, there is no fork(), so multiprocessing emulates it by spawning a new process that runs the main module again, using a different __name__. Without the guard, your main application would try to spawn new processes again from each child, resulting in an endless loop that eats up all your computer's memory pretty fast.
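
To see the re-import for yourself on any platform, you can force the spawn start method. The following is a minimal sketch meant to be saved and run as a script. The function names describe and child_task are made up for the demonstration. On current CPython the module-level line should be printed twice, once with '__main__' in the parent and once with a different name (expected to be '__mp_main__') in the child.

```python
import multiprocessing as mp

# Module-level code runs in the parent and again in each spawned child.
print("module-level code sees __name__ =", repr(__name__))


def describe(label):
    """Return a short message identifying who is speaking."""
    return "hello from " + label


def child_task():
    """Target executed inside the child process."""
    print(describe("child"))


if __name__ == "__main__":
    # Force "spawn" so the behaviour matches Windows regardless of the OS.
    ctx = mp.get_context("spawn")
    p = ctx.Process(target=child_task)
    p.start()
    p.join()
```

Everything outside the guard is executed once per process, so process creation placed there would be repeated by every child.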
