
MPI collective operations and process lifetime (C/C++)

For the problem I'd like to discuss, let's take MPI_Barrier as an example. The MPI 3 standard states:

If comm is an intracommunicator, MPI_BARRIER blocks the caller until all group members have called it. The call returns at any process only after all group members have entered the call.

So I was wondering how this statement has to be interpreted when some processes of the communicator have already exited (successfully) before MPI_Barrier is executed. The same question essentially applies to collective operations in general. For example, assume we have two processes A and B and pass MPI_COMM_WORLD as the comm argument to MPI_Barrier. After A and B call MPI_Init, if B immediately calls MPI_Finalize and exits, and only A calls MPI_Barrier before calling MPI_Finalize, is A blocked for eternity? Or is the set of "all group members" defined as the set of all original group members that have not yet exited? I'm pretty sure A is blocked forever, but maybe the MPI standard has more to say about this?

REMARK: This is not a question about the synchronizing properties of MPI_Barrier; the reference to MPI_Barrier is merely meant as a concrete example. It is a question about MPI program correctness when collective operations are performed. See the comments.

If B exits right at program start and only A calls MPI_Barrier, is A blocked for eternity?

Basically yes. But actually, you are not allowed to do that.

Simply speaking, you must call MPI_Finalize on all processes before exiting. MPI_Finalize acts like a collective (on MPI_COMM_WORLD), so it usually does not complete before every process has called MPI_Finalize. So in your example, process B didn't exit (at least not correctly).

But I guess the MPI 3.1 standard, section 8.7, explains it more clearly:

MPI_Finalize [...] This routine cleans up all MPI state. If an MPI program terminates normally (i.e., not due to a call to MPI_ABORT or an unrecoverable error) then each process must call MPI_FINALIZE before it exits. Before an MPI process invokes MPI_FINALIZE, the process must perform all MPI calls needed to complete its involvement in MPI communications: It must locally complete all MPI operations that it initiated and must execute matching calls needed to complete MPI communications initiated by other processes.

Note how the last sentence also requires B to execute the matching MPI_Barrier call for the barrier in your question before it calls MPI_Finalize.

According to the standard, your program is not correct. In practice it will most likely deadlock/hang.
