
mpi4py: scattering big arrays produces a deadlock

I am trying to scatter an array of shape (3,512,512,48,2), with the double-precision data type np.float64, across 3 processes using Scatter():

# mpirun -np 3 python3 prog.py
import numpy as np
from mpi4py import MPI

if __name__ == "__main__":
    comm = MPI.COMM_WORLD
    nproc = comm.Get_size()
    rank = comm.Get_rank()
    a = None

    # Every rank receives one [512,512,48,2] block.
    a_split = np.empty([512,512,48,2], dtype=np.float64)

    if rank == 0:
        # Only the root holds the full array.
        a = np.zeros([3,512,512,48,2], dtype=np.float64)

    comm.Scatter([a, MPI.DOUBLE], a_split, root=0)

However, the program deadlocks. From what I have found here

mpi4py scatter and gather with large numpy arrays

and here

Along what axis does mpi4py Scatterv function split a numpy array?

for big arrays I must use the Scatterv() function. So, here is another version of the code using this function:

# mpirun -np 3 python3 prog.py
import numpy as np
from mpi4py import MPI

if __name__ == "__main__":
    comm = MPI.COMM_WORLD
    nproc = comm.Get_size()
    rank = comm.Get_rank()  
    a = None

    a_split = np.empty([512,512,48,2],dtype = np.float64)

    size = 512*512*48*2 

    if rank == 0:
        a = np.zeros([3,512,512,48,2], dtype=np.float64)

    comm.Scatterv([a, (size,size,size), (0,size,2*size), MPI.DOUBLE], a_split, root=0)
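The counts and displacements in the Scatterv() call above are written out by hand for exactly 3 ranks. As a minimal sketch, they could instead be derived from the array shape and the number of ranks. The helper name `scatterv_layout` is hypothetical (it is not part of mpi4py), and the sketch assumes a C-contiguous array split along axis 0:

```python
import numpy as np

def scatterv_layout(shape, nproc):
    """Return (counts, displs), in elements, for splitting axis 0 over nproc ranks.

    Hypothetical helper for illustration; assumes a C-contiguous array.
    """
    rows = shape[0]
    # Number of elements in one slice along axis 0.
    row_elems = int(np.prod(shape[1:], dtype=np.int64))
    # Spread rows as evenly as possible; the first (rows % nproc) ranks get one extra row.
    base, extra = divmod(rows, nproc)
    rows_per_rank = [base + (1 if r < extra else 0) for r in range(nproc)]
    counts = [n * row_elems for n in rows_per_rank]
    # Each displacement is the sum of the counts of all lower ranks.
    displs = [sum(counts[:r]) for r in range(nproc)]
    return counts, displs

counts, displs = scatterv_layout((3, 512, 512, 48, 2), 3)
print(counts)
print(displs)
```

For the shape and rank count used in the question this gives the same values as `(size,size,size)` and `(0,size,2*size)`. When the split is uneven, each rank's receive buffer would have to be sized to its own entry in `counts`.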

This, however, also leads to a deadlock. I have also tried sending the arrays with point-to-point communication using Send() and Recv(), but this doesn't help. The deadlock appears to depend only on the array size: for example, if I change the shape of the arrays from [512,512,48,2] to [512,10,48,2], the code works.
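To put the two shapes in perspective, here is a small sketch that only does the size arithmetic, without MPI and without allocating the arrays. It does not identify the cause of the hang; it just quantifies how much data each rank receives. The helper name `chunk_info` is made up for this illustration.

```python
import numpy as np

ITEMSIZE = np.dtype(np.float64).itemsize  # bytes per double
INT_MAX = 2**31 - 1                       # largest count that fits in a C int

def chunk_info(shape):
    """Return (number of elements, number of bytes) of a float64 array of this shape."""
    elems = int(np.prod(shape, dtype=np.int64))
    return elems, elems * ITEMSIZE

big_elems, big_bytes = chunk_info((512, 512, 48, 2))
small_elems, small_bytes = chunk_info((512, 10, 48, 2))

print("hanging case:", big_elems, "elements,", big_bytes / 2**20, "MiB per rank")
print("working case:", small_elems, "elements,", small_bytes / 2**20, "MiB per rank")
print("full array on root:", 3 * big_bytes / 2**20, "MiB")
print("count fits in a C int:", 3 * big_elems <= INT_MAX)
```

The per-rank block is 192 MiB in the hanging case versus 3.75 MiB in the working case, and even the full array's element count is far below the C `int` limit used for MPI counts.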

Can anyone please suggest what I can do in this situation?

One issue is that you mix np.float and MPI.DOUBLE. A working script, which lets mpi4py infer the MPI datatype from the NumPy arrays, could be:

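# mpirun -np 3 python3 prog.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
nproc = comm.Get_size()
rank = comm.Get_rank()
a = None

# np.float was an alias for the builtin float and was removed in NumPy 1.24;
# np.float64 is the explicit spelling.
a_split = np.empty([512,512,48,2], dtype=np.float64)
a_split[:,:,:,:] = -666  # sentinel value to spot elements that were never received

if rank == 0:
    a = np.zeros([3,512,512,48,2], dtype=np.float64)

# No explicit MPI datatype: mpi4py infers it from the arrays.
comm.Scatter(a, a_split, root=0)

print(a_split[1,1,1,1], a_split[-1,-1,-1,-1])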

I've added the final print line to show that -np 4 runs but does not fill a_split entirely, while -np 2 fails with a truncation error. My guess is that -np 3 was intended.
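The behaviour for different -np values follows from simple arithmetic, sketched below without MPI. The sketch assumes that, when no count is given, the root buffer is divided evenly among the ranks. The function name `scatter_outcome` and its return strings are made up for this illustration.

```python
TOTAL = 3 * 512 * 512 * 48 * 2     # elements in the root array a
RECV_CAPACITY = 512 * 512 * 48 * 2 # elements in each rank's a_split

def scatter_outcome(nproc):
    """Classify what an even split of TOTAL over nproc ranks means for a_split."""
    if TOTAL % nproc != 0:
        return "uneven split"
    per_rank = TOTAL // nproc
    if per_rank > RECV_CAPACITY:
        return "truncation"    # more data sent than a_split can hold
    if per_rank < RECV_CAPACITY:
        return "partial fill"  # the tail of a_split keeps the -666 sentinel
    return "exact fit"

for n in (2, 3, 4):
    print(n, TOTAL // n, scatter_outcome(n))
```

Under that assumption, -np 2 would send 37,748,736 elements into a 25,165,824-element buffer, -np 3 fits exactly, and -np 4 fills only three quarters of a_split.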

If you mixed np.float and MPI.DOUBLE on purpose, please mention it in your question and add the -np value you use to launch the program.

[Edit] Here's also a C++ version of your script, so you can see if it is also deadlocking:

// mpic++ scat.cxx && mpirun -np <asmuchasyouwant> ./a.out

#include <algorithm>
#include <iostream>
#include <vector>
#include <mpi.h>

int main(int argc, char** argv) {
  MPI_Init(&argc, &argv);

  unsigned sz = 1*512*512*48*2;  // elements received by each rank
  int rank, nbproc;
  std::vector<double> a;
  std::vector<double> a_split(sz);

  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &nbproc);

  if (rank == 0) {
    // Only the root allocates and fills the full send buffer.
    a.resize(nbproc * sz);
    std::fill(a.begin(), a.end(), 2.71);
  } else {
    // Sentinel value to spot elements that were never received.
    std::fill(a_split.begin(), a_split.end(), -666.666);
  }

  MPI_Scatter(a.data(), sz, MPI_DOUBLE,
              a_split.data(), sz, MPI_DOUBLE,
              0, MPI_COMM_WORLD);  // root is rank 0

  std::cout << rank << " done " << a_split[sz-1] << std::endl;

  MPI_Finalize();
  return 0;
}


So, in the end, the solution was quite simple: I usually don't turn off my PC, and that seems to be why the deadlock appeared after a lot of computation. A simple reboot solved the problem.
