How to use all allocated nodes with Python on an HPC cluster

I have an HPC cluster with SLURM installed. I can properly allocate nodes and cores for myself. I would like to be able to use all the allocated cores regardless of which node they are on. As I saw in the thread Using the multiprocessing module for cluster computing, this cannot be achieved with multiprocessing.

My script looks like this (oversimplified version):

import multiprocessing

def func(input_data):
    # lots of computing that produces data
    return data

parallel_pool = multiprocessing.Pool(processes=300)
returned_data_list = []
for i in parallel_pool.imap_unordered(func, lots_of_input_data):
    returned_data_list.append(i)
# Do additional computing with returned_data_list
# ...

This script works perfectly fine; however, as I mentioned, multiprocessing is not a good tool for me: even if SLURM allocates 3 nodes to me, multiprocessing can only use one. As far as I understand, this is a limitation of multiprocessing.

I could use SLURM's srun, but that just executes the same script N times, and I need to do additional computing with the output of the parallel processes. I could of course store the outputs somewhere and read them back in (see the sketch below), but there must be a more elegant solution.
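To make that concrete, the store-and-read-back workaround would look roughly like the following sketch (func, lots_of_input_data and the out_*.pkl file names are placeholders): each copy started by srun picks its share of the input using the SLURM_PROCID and SLURM_NTASKS environment variables, pickles its results, and a follow-up script reads everything back in.

# worker.py - one copy of this runs per srun task
import os
import pickle

def func(input_data):
    # lots of computing
    return input_data

lots_of_input_data = range(1000)              # placeholder input
rank = int(os.environ["SLURM_PROCID"])        # index of this copy, set by srun
n_tasks = int(os.environ["SLURM_NTASKS"])     # how many copies srun started
my_results = [func(x) for x in lots_of_input_data[rank::n_tasks]]
with open("out_%d.pkl" % rank, "wb") as f:    # store this copy's output
    pickle.dump(my_results, f)

# gather.py - run once after the srun step, to read everything back in
import glob
import pickle

returned_data_list = []
for path in sorted(glob.glob("out_*.pkl")):
    with open(path, "rb") as f:
        returned_data_list.extend(pickle.load(f))
# Do additional computing with returned_data_list

This works, but it goes through the file system and needs a second script, which is exactly the inelegance I would like to avoid.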

In the mentioned thread there are suggestions like jug, but reading through it I haven't found a solution for myself.

Maybe mpi4py can be a solution for me? The tutorials for it seem very messy, and I haven't found a specific solution for my problem there either (run a function in parallel with MPI, and then continue the script; a sketch of what I mean follows).
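What I am after would look something like this minimal sketch, assuming mpi4py's MPIPoolExecutor (func and lots_of_input_data are placeholders again):

# script.py - run a function in parallel over MPI, then continue the script
from mpi4py.futures import MPIPoolExecutor

def func(input_data):
    # lots of computing
    return input_data

if __name__ == "__main__":
    lots_of_input_data = range(1000)              # placeholder input
    with MPIPoolExecutor() as executor:           # workers on the other MPI ranks
        returned_data_list = list(executor.map(func, lots_of_input_data))
    # Do additional computing with returned_data_list

Launched with something like srun -n 300 python -m mpi4py.futures script.py, the extra ranks act as workers and only the root rank continues past the pool, which is the "run in parallel, then continue" behaviour I need.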

I tried subprocess calls, but they seem to work the same way as multiprocessing calls, so they only run on one node. I haven't found any confirmation of this, so it is only a guess from my trial and error.

How can I overcome this problem? Currently I can allocate more than 300 cores, but one node only has 32, so if I could find a solution I would be able to run my project nearly 10 times as fast.

Thanks

After a lot of trouble, scoop was the library that solved my problem.
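A minimal sketch of the same pool pattern ported to SCOOP (func and lots_of_input_data are placeholders):

# script.py - the multiprocessing pattern, rewritten for SCOOP
from scoop import futures

def func(input_data):
    # lots of computing
    return input_data

if __name__ == "__main__":
    lots_of_input_data = range(1000)          # placeholder input
    returned_data_list = list(futures.map(func, lots_of_input_data))
    # Do additional computing with returned_data_list

Started from inside the SLURM allocation with something like python -m scoop -n 300 script.py (or with a --hostfile listing the allocated nodes), SCOOP spawns workers across all the nodes, and futures.map brings the results back into the main script so the additional computing can follow directly.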
