My aim is to iterate through each element of a large 2D array (data) and do some heavy processing on each element. I therefore want to use multiple MPI processes, each taking a portion of the array to work on. My problem is that I don't know exactly how to write the code that gathers all the data back together at the end. Here is some example code:
import numpy as np
import math
from mpi4py import MPI

M = 400
N = 300

data = np.random.rand(M, N)
result_a = np.zeros((M, N))
result_b = np.zeros((M, N))

def process_function(data):
    a = data**2
    b = data**0.5
    return a, b
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()
minimum = 0
maximum = int(M*N)
perrank = maximum//size
for index in range(minimum + rank*perrank, minimum + (rank+1)*perrank):
    i = int(math.floor(index/N))
    j = int(index % N)
    a, b = process_function(data[i, j])
    result_a[i, j] = a
    result_b[i, j] = b
a_gath = comm.gather(result_a, root=0)
b_gath = comm.gather(result_b, root=0)
print(np.shape(a_gath))
print('---')
print(np.shape(b_gath))
Unfortunately, for my real problem, when I save both a_gath and b_gath to disk (as a pickle), they each contain only a single occurrence of () (i.e. None) when I re-load them. Is there something else I should be doing before/after comm.gather?
Here is my submission script:
#!/bin/bash -l
#$ -S /bin/bash
#$ -l h_rt=00:05:00
#$ -l mem=2G
#$ -l tmpfs=10G
#$ -pe mpi 5
#$ -N stack_test
#$ -notify
#$ -wd /home/user/Scratch/
module load gcc-libs
module load python3/recommended
module unload compilers mpi
module load compilers/gnu/4.9.2
module load mpi/openmpi/3.1.1/gnu-4.9.2
module load mpi4py
module list
python_infile=test.py
echo ""
echo "Running python < $python_infile ..."
echo ""
gerun python $python_infile
I submit this script simply with qsub js_test.sh. The returned .o file for this fake example shows that, in this case, 4 of the 5 MPI processes contain None. Could it be, then, that if I saved a_gath and b_gath to disk, it would save the last process's result, which is None? I would expect that after using comm.gather I would have a single array of size MxN for each of a_gath and b_gath:
()
---
()
()
---
()
()
---
()
(5, 400, 300)
---
(5, 400, 300)
()
---
()
Many thanks.
To clarify the answer that you've found for yourself in the comments: MPI_Gather is a rooted operation: its results are not identical across all ranks, and specifically differ on the rank given in the root argument.
In the case of Gather, your finding that rank 0 is the one that ends up with the data is exactly right for the way you've called it (with root=0).
While in principle MPI supports multiple-program, multiple-data execution, in which different ranks run different code, in practice most MPI code is written in a single-program, multiple-data style, like what you've written. Because all ranks run the same body of code, it's up to you to check, after returning from a rooted operation like MPI_Gather, whether the rank you're running on is the root, and execute different code paths accordingly. If you don't, then every rank will execute these lines:
print(np.shape(a_gath))
print('---')
print(np.shape(b_gath))
which, as you've noted, does not print the results you expected for a_gath and b_gath except on rank 0.
Try the following:
a_gath = comm.gather(result_a, root=0)
b_gath = comm.gather(result_b, root=0)

if rank == 0:
    print(np.shape(a_gath))
    print('---')
    print(np.shape(b_gath))
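One further point that follows from the code in the question (an assumption about your intent, not something gather does for you): each rank's result_a is zero everywhere except the slice of indices that rank processed, so after the gather, rank 0 holds a list of per-rank (M, N) arrays that still need to be combined into one. Because the per-rank index ranges are disjoint, an element-wise sum of the list reconstructs the full array. A numpy-only sketch of that combining step, with small M, N:

```python
import numpy as np

M, N = 4, 3
size = 2                       # pretend we ran with 2 ranks
perrank = (M * N) // size

# Simulate what each rank would hold after its loop: zeros everywhere
# except the entries that rank actually computed.
full = np.arange(M * N, dtype=float).reshape(M, N)
per_rank_results = []
for rank in range(size):
    partial = np.zeros((M, N))
    for index in range(rank * perrank, (rank + 1) * perrank):
        i, j = index // N, index % N
        partial[i, j] = full[i, j]
    per_rank_results.append(partial)

# per_rank_results stands in for the list comm.gather delivers on root;
# summing the disjoint partial arrays rebuilds the complete (M, N) result.
combined = np.sum(per_rank_results, axis=0)
assert np.array_equal(combined, full)
```

On rank 0 in your real script, the equivalent would be np.sum(a_gath, axis=0) before pickling, so you save a single (M, N) array rather than a list of mostly-zero arrays.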