
In MPI_Send / MPI_Recv pairs, can data be lost if it isn't synchronised correctly?

Let me explain. Consider 4 slave nodes 1, 2, 3, 4 and a master node 0. Now 1, 2, 3 and 4 need to send data to 0, and 0 receives this data as follows:

for (int proc = 1; proc < procCount; proc++) // for each slave processor (procCount = 5)
{
    for (int p = 0; p < 50; p++)
    {
        std::cout << proc << "\tA\t" << p << std::endl;

        // read in binary data
        int chunkP;
        int realP;
        real fitnessVal;
        real fitnessValB;
        real fitnessValC;
        int conCount;
        real subConCount;
        real networkEnergyLoss;
        real movementEnergyLoss;
        long spikeCount;

        MPI_Recv(reinterpret_cast<char *>(&chunkP),
                 sizeof(chunkP),
                 MPI_CHAR, proc, MPI_ANY_TAG, MPI_COMM_WORLD, &stat);
        MPI_Recv(reinterpret_cast<char *>(&realP),
                 sizeof(realP),
                 ...
    }
}

Clearly, the order in which 1, 2, 3 and 4 send their data to 0 cannot be assumed, since they all operate independently of each other: 2 might send data before 1. So assuming 2 does send its data before 1 (for example), the receiving loop in 0 shown above won't proceed until the source rank 'proc' in the MPI_Recv call matches processor 1, because the outer for loop forces this ordering.

So the loop 'waits' for data from 1 before it can do anything else, even if data is already arriving from 2, 3 and 4. What happens to the data arriving from 2, 3 and 4 if it arrives before 1's? Can it be 'forgotten', in the sense that once data from 1 does start arriving and proc then increments to 2, the data originally sent by 2 is simply no longer there? If it is 'forgotten', the whole distributed simulation will just hang, because it never ends up being able to process the data of a particular slave process correctly.

Thanks, Ben.

Firstly, do you really mean to receive MPI_CHAR into chunkP, which is an int? Shouldn't you receive MPI_INT?
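For example, the type-correct form of that receive would look like this (a minimal sketch only, reusing the question's `proc` loop variable and assuming `stat` is the MPI_Status already used in that loop):

int chunkP;
MPI_Recv(&chunkP,  // no reinterpret_cast needed: MPI_Recv takes a void*
         1,        // count is 1 element of the matching MPI datatype
         MPI_INT,  // matches the C type of chunkP
         proc, MPI_ANY_TAG, MPI_COMM_WORLD, &stat);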

The messages from ranks 1:4 will not get lost; they will be queued until rank 0 chooses to receive them. This behaviour is mandated by the MPI standard.

If the messages are large enough, ranks 1:4 may block until they can actually send their messages to rank 0 (most MPI implementations have limited buffering).
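One way a slave rank can avoid stalling on such a send is a non-blocking send; this is only a hedged sketch (MPI_Isend is not mentioned in the original answer), showing the general pattern:

int chunkP = 42;  // example payload value, for illustration only
MPI_Request req;
MPI_Isend(&chunkP, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &req);
// ... the slave can do other work here, but chunkP must not be
//     modified until the send has completed ...
MPI_Wait(&req, MPI_STATUS_IGNORE);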

You might also consider having rank 0 do an MPI_ANY_SOURCE receive for the first receive, to see who's ready to send. You'll need to take care, though, to ensure that the subsequent receives are posted for the corresponding source: look in the MPI_Status struct to see where the message was actually sent from.
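Put together, a minimal sketch of that pattern on rank 0 (not from the original post) might look like this:

MPI_Status stat;
int chunkP;
// accept the first field from whichever slave is ready
MPI_Recv(&chunkP, 1, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &stat);
int src = stat.MPI_SOURCE;  // the rank that actually sent the message
// every subsequent field of this record must come from that same rank
int realP;
MPI_Recv(&realP, 1, MPI_INT, src, MPI_ANY_TAG, MPI_COMM_WORLD, &stat);
// ... remaining fields for this record are received from `src` in the same way ...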
