
MPI_Recv not receiving all MPI_Send requests

I have a bug in my code. I have multiple processes, all processing data from a binary tree. At the end, they should send their results to the master node (node 0), where the results will be processed. However, for some reason, some of the MPI_Sends are not being received.

int *output=(int*) malloc(sizeof(int)*(varNum+2)); //contains all variable values and maxSAT and assignNum

if(proc_id!=0 && proc_id<nodeNums){
    output[0]=maxSAT;
    output[1]=assignNum;
    for(i=2;i<varNum+2;i++){
        output[i]=varValues[i-2];
    }
    MPI_Send(output,varNum+2,MPI_INT,0,TAG,MPI_COMM_WORLD);
    printf("proc %d sent data\n",proc_id);
}
else if(proc_id==0){
    for(i=1;i<nodeNums;i++){
        printf("receiving data from %d\n",i);
        MPI_Recv(output,varNum+2,MPI_INT,i,TAG,MPI_COMM_WORLD,MPI_STATUS_IGNORE);
        if(output[0]>maxSAT){
            maxSAT=output[0];
            assignNum=output[1];
            for(i=0;i<varNum;i++){
                varValues[i]=output[i+2];
            }   
        }
        else if(output[0]==maxSAT){
            assignNum+=output[1];
        }
    }
}

When I run it with 8 processes (nodeNums=8), this is the output.

proc 2 sent data
receiving data from 1
proc 5 sent data
proc 6 sent data
proc 3 sent data
proc 7 sent data
proc 1 sent data
proc 4 sent data

For some reason, all processes are sending data, but the master only receives from process 1. However, if I run it with 4 processes, everything is sent and received. Does anyone have any idea why this happens?

The problem has nothing to do with MPI. Your mistake is the use of the same variable in two different but nested loops:

else if(proc_id==0){
    for(i=1;i<nodeNums;i++){ <----------------- (1)
        ...
            for(i=0;i<varNum;i++){ <----------- (2)
                varValues[i]=output[i+2];
            }
        ...
    }
}

After the inner loop completes, the value of i is equal to varNum, and if it happens that varNum is greater than or equal to nodeNums, the outer loop terminates too. Change the name of the loop variable of the inner loop.
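For illustration, here is a minimal sketch of the corrected receiving loop, with the inner loop variable renamed to j (the name j is just an assumption; any variable distinct from i works):

else if(proc_id==0){
    for(i=1;i<nodeNums;i++){
        printf("receiving data from %d\n",i);
        MPI_Recv(output,varNum+2,MPI_INT,i,TAG,MPI_COMM_WORLD,MPI_STATUS_IGNORE);
        if(output[0]>maxSAT){
            maxSAT=output[0];
            assignNum=output[1];
            for(int j=0;j<varNum;j++){   /* renamed from i, so the outer loop index is untouched */
                varValues[j]=output[j+2];
            }
        }
        else if(output[0]==maxSAT){
            assignNum+=output[1];
        }
    }
}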

This is not really the way to use MPI, though. What you want here is MPI_Gather(), where every process (including the root) sends a chunk of data and the gathering process receives them all. Like this:

rbuf = (int *)malloc(nodeNums*(varNum+2)*sizeof(int));
MPI_Gather(output, varNum+2, MPI_INT, rbuf, varNum+2, MPI_INT, 0, MPI_COMM_WORLD);

All your processes should execute the above at the same point in their execution. All the data will end up in rbuf.

In your case, if the root doesn't want to contribute anything, just have it send empty data, which it can simply ignore (after all, it doesn't need to physically "send" to itself, so this is not very inefficient).
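As a sketch of how the root might then process the gathered data, assuming the same variables as above and that the root contributed a dummy chunk of its own: after MPI_Gather, rank i's data sits at rbuf + i*(varNum+2), laid out as [maxSAT, assignNum, varValues...], so the root can scan the chunks much like the original receive loop:

if(proc_id==0){
    for(i=1;i<nodeNums;i++){                 /* skip the root's own dummy chunk at i == 0 */
        int *chunk = rbuf + i*(varNum+2);    /* rank i's [maxSAT, assignNum, varValues...] */
        if(chunk[0]>maxSAT){
            maxSAT=chunk[0];
            assignNum=chunk[1];
            for(int j=0;j<varNum;j++){
                varValues[j]=chunk[j+2];
            }
        }
        else if(chunk[0]==maxSAT){
            assignNum+=chunk[1];
        }
    }
}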
