
Call to MPI_Recv hangs

To keep it simple, sending up means sending to rank+1 and sending down means sending to rank-1.

The code sends arrays back and forth between neighboring nodes. Here is the code:

MPI_Request req1, req2;
MPI_Status s1, s2;
if (procid+1 != nproc) {
    // send up
    MPI_Isend(prior_hot_plate[my_num_rows-2], TOTAL_COLS, MPI_FLOAT, procid+1, k, MPI_COMM_WORLD, &req1);
    ++k;
    fprintf(stderr, "%d made it through Isend up\n", procid);
}
if (procid-1 != -1) {
    // send down
    MPI_Isend(prior_hot_plate[1], TOTAL_COLS, MPI_FLOAT, procid-1, k, MPI_COMM_WORLD, &req2);
    ++k;
    fprintf(stderr, "%d made it past Isend down\n", procid);
}
if (procid+1 != nproc) {
    // recv up
    //MPI_Wait(&req1, &s1);
    //fprintf(stderr, "%d finished wait\n", procid);
    MPI_Recv(prior_hot_plate[my_num_rows-1], TOTAL_COLS, MPI_FLOAT, procid+1, k, MPI_COMM_WORLD, &s1);
    ++k;
    fprintf(stderr, "%d finished receiving\n", procid);
}
if (procid-1 != -1) {
    // recv down
    //MPI_Wait(&req2, &s2);
    //fprintf(stderr, "%d finished wait\n", procid);
    MPI_Recv(prior_hot_plate[0], TOTAL_COLS, MPI_FLOAT, procid-1, k, MPI_COMM_WORLD, &s2);
    ++k;
    fprintf(stderr, "%d finished receiving\n", procid);
}

Each of the nodes makes it past the Isend calls without a problem, but then all of them hang on the calls to MPI_Recv. Does anyone see something wrong with this? What am I missing?

Thanks

When you make a call to MPI_Isend, the last parameter that you pass in (and get back) is an MPI_Request object. Your initial call to MPI_Isend doesn't (necessarily) perform the send itself. It just informs MPI that you'd like to do the send operation sometime between now and when you complete that request. To signal that you'd like to complete the request, you need to make a matching call to a completion function (such as MPI_Wait or MPI_Test).
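
As an aside, here is a minimal, self-contained sketch (not code from the question) of that start-then-complete pattern between two ranks, with an illustrative buffer size:

#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    float buf[16] = {0};
    if (rank == 0) {
        MPI_Request req;
        // Start the send; MPI may or may not transfer anything yet.
        MPI_Isend(buf, 16, MPI_FLOAT, 1, 0, MPI_COMM_WORLD, &req);
        // Complete the request; only after this returns is buf safe to reuse.
        MPI_Wait(&req, MPI_STATUS_IGNORE);
    } else if (rank == 1) {
        // A plain blocking receive matches the Isend on rank 0.
        MPI_Recv(buf, 16, MPI_FLOAT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }

    MPI_Finalize();
    return 0;
}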

There are other questions on SO that cover this as well (here, for instance).

For your particular question, the right thing to do would be to convert all of your communication to non-blocking calls and then do a big MPI_Waitall at the bottom:

MPI_Request reqs[] = {MPI_REQUEST_NULL, MPI_REQUEST_NULL, MPI_REQUEST_NULL, MPI_REQUEST_NULL};
if (procid+1 != nproc) {
    // send up
    MPI_Isend(prior_hot_plate[my_num_rows-2], TOTAL_COLS, MPI_FLOAT, procid+1, k, MPI_COMM_WORLD, &reqs[0]);
}
if (procid-1 != -1) {
    // send down
    MPI_Isend(prior_hot_plate[1], TOTAL_COLS, MPI_FLOAT, procid-1, k, MPI_COMM_WORLD, &reqs[1]);
}
if (procid+1 != nproc) {
    // recv up
    MPI_Irecv(prior_hot_plate[my_num_rows-1], TOTAL_COLS, MPI_FLOAT, procid+1, k, MPI_COMM_WORLD, &reqs[2]);
}
if (procid-1 != -1) {
    // recv down
    MPI_Irecv(prior_hot_plate[0], TOTAL_COLS, MPI_FLOAT, procid-1, k, MPI_COMM_WORLD, &reqs[3]);
}
++k;
MPI_Waitall(4, reqs, MPI_STATUSES_IGNORE);

Well, I found the answer. I tried Wesley's approach but I couldn't get it to work; it just kept segfaulting. However, his example led me to eventually change my code. In the original version I was incrementing k, the tag, after every call to send and recv. As a result the recv calls were looking for a message with the wrong tag. By switching it to how Wesley has it - incrementing k at the very end - the problem was solved.
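
For reference, a minimal sketch of what that fix amounts to in the blocking version from the question (same variable names as above). The sends and the matching receives all use the same tag k, and k is only incremented once the whole exchange is finished; the unconditional MPI_Wait calls at the end are my own assumption, added to complete the outstanding Isend requests, and are not part of the answer as stated:

MPI_Request req1 = MPI_REQUEST_NULL, req2 = MPI_REQUEST_NULL;
MPI_Status s1, s2;
if (procid+1 != nproc) {
    // send up with tag k
    MPI_Isend(prior_hot_plate[my_num_rows-2], TOTAL_COLS, MPI_FLOAT, procid+1, k, MPI_COMM_WORLD, &req1);
}
if (procid-1 != -1) {
    // send down with tag k
    MPI_Isend(prior_hot_plate[1], TOTAL_COLS, MPI_FLOAT, procid-1, k, MPI_COMM_WORLD, &req2);
}
if (procid+1 != nproc) {
    // recv up: same tag k, so it matches the upper neighbor's "send down"
    MPI_Recv(prior_hot_plate[my_num_rows-1], TOTAL_COLS, MPI_FLOAT, procid+1, k, MPI_COMM_WORLD, &s1);
}
if (procid-1 != -1) {
    // recv down: same tag k, so it matches the lower neighbor's "send up"
    MPI_Recv(prior_hot_plate[0], TOTAL_COLS, MPI_FLOAT, procid-1, k, MPI_COMM_WORLD, &s2);
}
// Assumed addition: complete the outstanding sends before reusing those rows.
// Waiting on MPI_REQUEST_NULL is a no-op, so this is safe at the boundary ranks.
MPI_Wait(&req1, MPI_STATUS_IGNORE);
MPI_Wait(&req2, MPI_STATUS_IGNORE);
++k;   // advance the tag only once per exchange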
