
MPI_Irecv does not receive all sends?

What I am trying to achieve in this simplified code is:

  • 2 types of processes: root (rank 10) and children (ranks 0-9)
  • init:
    • root will listen for the children's "completed" messages
    • children will listen for root's notification that all have completed
  • while there is no winner (not all done yet):
    • each child has a 20% chance of being done (and then notifies root that it is done)
    • root checks whether all children are done
      • if all done: send the "winner" notification to the children
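
As a side note on the 20% figure: the code in this thread expresses it as ((rand() % 10) + 1) < 3. The small sketch below only checks that arithmetic; the helper names is_done_roll and done_residues are made up for illustration and do not appear in the original code.

```cpp
// (r % 10) + 1 lies in 1..10; only the values 1 and 2 are < 3,
// so two of the ten possible residues count as "done".
bool is_done_roll(int r) {
    return (r % 10) + 1 < 3;
}

// Count how many of the residues 0..9 count as "done".
int done_residues() {
    int hits = 0;
    for (int r = 0; r < 10; ++r) {
        if (is_done_roll(r)) {
            ++hits;
        }
    }
    return hits;
}
```

By the same reasoning, a check written with < 2 instead of < 3 accepts one residue in ten, which is a 10% chance per roll.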

I have code like:

int numprocs, id, arr[10], winner = -1;
bool stop = false;
MPI_Request reqs[10], winnerNotification;

MPI_Init(NULL, NULL);
MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
MPI_Comm_rank(MPI_COMM_WORLD, &id);

for (int half = 0; half < 1; half++) {
    for (int round = 0; round < 1; round++) {
        if (id == 10) { // root
            // keeps track of who has "completed"
            fill_n(arr, 10, -1);
            for (int i = 0; i < 10; i++) {
                MPI_Irecv(&arr[i], 1, MPI_INT, i, 0, MPI_COMM_WORLD, &reqs[i]);
            }
        } else if (id < 10) { // children
            // listen to root of winner notification/indication to stop
            MPI_Irecv(&winner, 1, MPI_INT, 10, 1, MPI_COMM_WORLD, &winnerNotification);
        }

        while (winner == -1) {
            //cout << id << " is in loop" << endl;

            if (id < 10 && !stop && ((rand() % 10) + 1) < 3) { 
                // children has 20% chance to stop (finish work)
                MPI_Send(&id, 1, MPI_INT, 10, 0, MPI_COMM_WORLD);
                cout << id << " sending to root" << endl;
                stop = true;
            } else if (id == 10) {
                // root checks number of children completed
                int numDone = 0;
                for (int i = 0; i < 10; i++) {
                    if (arr[i] >= 0) {
                        //cout << "root knows that " << i << " has completed" << endl;
                        numDone++;
                    }
                }
                cout << "numDone = " << numDone << endl;

                // if all done, send notification to players to stop
                if (numDone == 10) {
                    winner = 1;
                    for (int i = 0; i < 10; i++) {
                        MPI_Send(&winner, 1, MPI_INT, i, 1, MPI_COMM_WORLD);
                    }
                    cout << "root sent notification of winner" << endl;
                }
            }
        }
    }
}

MPI_Finalize();

Output from the debugging couts looks like this. The problem seems to be that root is not receiving all of the children's notifications that they have completed?

2 sending to root
3 sending to root
0 sending to root
4 sending to root
1 sending to root
8 sending to root
9 sending to root
numDone = 1
numDone = 1
... // many numDone = 1, but why 1 only?
7 sending to root
...

I thought perhaps I can't receive into an array, but I tried:

if (id == 1) {
    int x = 60;
    MPI_Send(&x, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
} else if (id == 0) {
    MPI_Recv(&arr[1], 1, MPI_INT, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    cout << id << " recieved " << arr[1] << endl;
}

Which works.

UPDATE

This seems to be resolved if I add an MPI_Barrier(MPI_COMM_WORLD) before the end of the while loop, but why? Even if the processes run out of sync, eventually the children will tell root that they have completed, and root should "listen" for that and process it accordingly? What seems to be happening is that root keeps running and hogs all the resources, so the children never get to execute at all? Or what's happening here?

UPDATE 2: some children not getting notification from root

Ok, now that the problem of root not receiving the children's completion notifications is solved by @MichaelSh's answer, I will focus on the children not receiving from the parent. Here's code that reproduces that problem:

int numprocs, id, arr[10], winner = -1;
bool stop = false;
MPI_Request reqs[10], winnerNotification;

MPI_Init(NULL, NULL);
MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
MPI_Comm_rank(MPI_COMM_WORLD, &id);

srand(time(NULL) + id);

if (id < 10) {
    MPI_Irecv(&winner, 1, MPI_INT, 10, 0, MPI_COMM_WORLD, &winnerNotification);
}
MPI_Barrier(MPI_COMM_WORLD);

while (winner == -1) {
    cout << id << " is in loop ..." << endl;
    if (id == 10) {
        if (((rand() % 10) + 1) < 2) {
            winner = 2;
            for (int i = 0; i < 10; i++) {
                MPI_Send(&winner, 1, MPI_INT, i, 0, MPI_COMM_WORLD);
            }
            cout << "winner notifications sent" << endl;
        }
    }
}

cout << id << " b4 MPI_Finalize. winner is " << winner << endl;

MPI_Finalize();

Output looks like:

# 1 run
winner notifications sent
10 b4 MPI_Finalize. winner is 2
9 b4 MPI_Finalize. winner is 2
0 b4 MPI_Finalize. winner is 2

# another run
winner notifications sent
10 b4 MPI_Finalize. winner is 2
8 b4 MPI_Finalize. winner is 2

Notice that some processes don't seem to get the notification from the parent? Why is that? Will MPI_Wait in the child processes just hang them? So how do I resolve this?

Also, regarding this comment:

"All MPI_Barrier does in your case is wait for the child responses to complete. Please check my answer for a better solution."

If I don't do this, I suppose each child response will take just a few ms? So even if I don't wait or use a barrier, I'd expect the receive to still happen soon after the send? Unless some processes end up hogging resources so that other processes do not run?

Please try this block of code (error checking omitted for simplicity):

...
// root checks number of children completed
int numDone = 0;
MPI_Status statuses[10];
MPI_Waitall(10, reqs, statuses);
for (int i = 0; i < 10; i++) {
...

Edit: a better solution:
Each child posts a receive for root's winner notification and sends its own notification to the root.
Root posts receives for the children's notifications into the array, waits for all of them to arrive, and then sends the winner's id to the children. Insert the code below after for (int round = 0; round < 1; round++):

            if (id == 10) 
            { // root
                // keeps track of who has "completed"
                memset(arr, -1, sizeof(arr));
                for (int i = 0; i < 10; i++) 
                {
                    MPI_Irecv(&arr[i], 1, MPI_INT, i, 0, MPI_COMM_WORLD, &reqs[i]);
                }
            } 
            else if (id < 10) 
            { // children
                // listen to root of winner notification/indication to stop
                MPI_Irecv(&winner, 1, MPI_INT, 10, 1, MPI_COMM_WORLD, &winnerNotification);
            }

            if (id < 10)
            {
                // each roll has a 20% chance of finishing, so keep spinning
                // while the roll says "not done yet"
                while (((rand() % 10) + 1) >= 3) ;

                // child is done (finished work): notify root
                MPI_Send(&id, 1, MPI_INT, 10, 0, MPI_COMM_WORLD);
                std::cout << id << " sending to root" << std::endl;
                // receive winner notification
                MPI_Status status;
                MPI_Wait(&winnerNotification, &status);
                // Process winner notification
            } 
            else if (id == 10) 
            {
                MPI_Status statuses[10];
                MPI_Waitall(10, reqs, statuses);                    

                // if all done, send notification to players to stop
                {
                    winner = 1;
                    for (int i = 0; i < 10; i++) 
                    {
                        MPI_Send(&winner, 1, MPI_INT, i, 1, MPI_COMM_WORLD);
                    }
                    std::cout << "root sent notification of winner" << std::endl;
                }
            }                            
