
Call to MPI_Recv hangs

To keep it simple, "sending up" means sending to rank+1 and "sending down" means sending to rank-1.

The code sends arrays back and forth between neighboring ranks. Here is the code:

MPI_Request req1, req2;
MPI_Status s1, s2;
if (procid+1 != nproc) {
    // send up
    MPI_Isend(prior_hot_plate[my_num_rows-2], TOTAL_COLS, MPI_FLOAT, procid+1, k, MPI_COMM_WORLD, &req1);
    ++k;
    fprintf(stderr, "%d made it through Isend up\n", procid);
}
if (procid-1 != -1) {
    // send down
    MPI_Isend(prior_hot_plate[1], TOTAL_COLS, MPI_FLOAT, procid-1, k, MPI_COMM_WORLD, &req2);
    ++k;
    fprintf(stderr, "%d made it past Isend down\n", procid);
}
if (procid+1 != nproc) {
    // recv up
    //MPI_Wait(&req1, &s1);
    //fprintf(stderr, "%d finished wait\n", procid);
    MPI_Recv(prior_hot_plate[my_num_rows-1], TOTAL_COLS, MPI_FLOAT, procid+1, k, MPI_COMM_WORLD, &s1);
    ++k;
    fprintf(stderr, "%d finished receiving\n", procid);
}
if (procid-1 != -1) {
    // recv down
    //MPI_Wait(&req2, &s2);
    //fprintf(stderr, "%d finished wait\n", procid);
    MPI_Recv(prior_hot_plate[0], TOTAL_COLS, MPI_FLOAT, procid-1, k, MPI_COMM_WORLD, &s2);
    ++k;
    fprintf(stderr, "%d finished receiving\n", procid);
}

Each of the nodes makes it past the Isend calls with no problem, but then all of them hang on the calls to MPI_Recv. Does anyone see something wrong with this? What am I missing?

Thanks

When you make a call to MPI_Isend, the last parameter that you pass in (and get back) is an MPI_Request object. Your initial call to MPI_Isend doesn't (necessarily) perform the send itself. It just informs MPI that you'd like to do the send operation sometime between now and when you complete that request. To signal that you'd like to complete the request, you need to make a matching call to a completion function (such as MPI_Wait or MPI_Test).
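As a rough illustration, here is a minimal sketch of that pattern (this is not the poster's code; send_row, buf, count, dest, and tag are placeholder names):

// Minimal sketch of the Isend/Wait pattern; buf, count, dest, and tag are placeholders.
#include <mpi.h>

void send_row(float *buf, int count, int dest, int tag)
{
    MPI_Request req;

    // Start the send; MPI may or may not transfer anything yet.
    MPI_Isend(buf, count, MPI_FLOAT, dest, tag, MPI_COMM_WORLD, &req);

    /* ... useful work can overlap with the transfer here ... */

    // Complete the request; only after this is buf safe to reuse.
    MPI_Wait(&req, MPI_STATUS_IGNORE);
}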

There are other questions on SO that cover this as well (here, for instance).

For your particular question, the right thing to do would be to convert all of your communication to non-blocking calls and then do a big MPI_Waitall at the bottom:

// Initializing to MPI_REQUEST_NULL lets MPI_Waitall ignore any slot that is
// never filled (e.g. the boundary ranks skip one send/recv pair).
MPI_Request reqs[] = {MPI_REQUEST_NULL, MPI_REQUEST_NULL, MPI_REQUEST_NULL, MPI_REQUEST_NULL};
if (procid+1 != nproc) {
    // send up
    MPI_Isend(prior_hot_plate[my_num_rows-2], TOTAL_COLS, MPI_FLOAT, procid+1, k, MPI_COMM_WORLD, &reqs[0]);
}
if (procid-1 != -1) {
    // send down
    MPI_Isend(prior_hot_plate[1], TOTAL_COLS, MPI_FLOAT, procid-1, k, MPI_COMM_WORLD, &reqs[1]);
}
if (procid+1 != nproc) {
    // recv up
    MPI_Irecv(prior_hot_plate[my_num_rows-1], TOTAL_COLS, MPI_FLOAT, procid+1, k, MPI_COMM_WORLD, &reqs[2]);
}
if (procid-1 != -1) {
    // recv down
    MPI_Irecv(prior_hot_plate[0], TOTAL_COLS, MPI_FLOAT, procid-1, k, MPI_COMM_WORLD, &reqs[3]);
}
++k;
MPI_Waitall(4, reqs, MPI_STATUSES_IGNORE);

Well, I found the answer. I tried Wesley's approach, but I couldn't get it to work; it just kept segfaulting. However, his example eventually led me to change my code. In the original version I was incrementing k, the tag, after every call to send and recv. As a result, the recv calls were looking for a message with the wrong tag. By switching it to how Wesley has it, incrementing k only once at the very end, the problem was solved.
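For anyone hitting the same symptom: a receive only matches a send that carries the same tag (or a receive posted with MPI_ANY_TAG). Here is a tiny sketch, assuming two hypothetical ranks (0 sends, 1 receives), a float buffer data of length N, and the shared tag counter k from the snippets above:

if (procid == 0) {
    // sent with tag k
    MPI_Send(data, N, MPI_FLOAT, 1, k, MPI_COMM_WORLD);
} else if (procid == 1) {
    // must be posted with the same tag k (or MPI_ANY_TAG) to match;
    // a receive posted with tag k+1 would block forever, because no
    // message with that tag is ever sent
    MPI_Recv(data, N, MPI_FLOAT, 0, k, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
}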
