MPI signal handling

When using mpirun, is it possible to catch signals (for example, the SIGINT generated by ^C) in the code being run?

For example, I'm running parallelized Python code. I can catch KeyboardInterrupt with an except clause when running python blah.py by itself, but I can't when doing mpirun -np 1 python blah.py.

Does anyone have a suggestion? Even finding how to catch signals in a C or C++ compiled program would be a helpful start.

If I send a signal directly to the spawned Python processes, they handle it properly; however, signals sent to the parent orterun process (i.e. from exceeding the wall time on a cluster, or pressing Control-C in a terminal) kill everything immediately.

I think it is really implementation dependent.

SIGINT, SIGUSR1, and SIGUSR2 will be passed through to the processes.

In Intel MPI, the environment variables I_MPI_JOB_SIGNAL_PROPAGATION and I_MPI_JOB_TIMEOUT_SIGNAL can be set to control how signals are sent to the processes.

Another thing worth noting: many Python scripts call into other libraries or code through Cython, and if SIGUSR1 is caught by a subprocess, something unwanted might happen.

If you use mpirun --nw, then mpirun itself should terminate as soon as it has started the subprocesses, instead of waiting for their termination; if that's acceptable, then I believe your processes would be able to catch their own signals.

Python's signal module supports setting signal handlers using signal.signal:

Set the handler for signal signalnum to the function handler. handler can be a callable Python object taking two arguments (see below), or one of the special values signal.SIG_IGN or signal.SIG_DFL. The previous signal handler will be returned...

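import signal

def ignore(sig, stack):
    # The handler receives the signal number and the current stack frame.
    print("I'm ignoring signal %d" % (sig,))

# Install the handler for SIGINT (the signal sent by ^C).
signal.signal(signal.SIGINT, ignore)
while True:
    pass  # busy-wait so the process stays alive to receive signals

If you send a SIGINT to a Python interpreter running this script (via kill -INT <pid>), it will print a message and simply continue to run.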
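If the goal is to get the same cleanup path as except KeyboardInterrupt, one option is to have the handler raise an exception. The sketch below assumes the launcher delivers SIGTERM to the ranks when it is interrupted, which depends on the MPI implementation; Terminated, raise_terminated, and main_loop are hypothetical names, not part of any library.

```python
import os
import signal

class Terminated(Exception):
    """Hypothetical exception raised when SIGTERM arrives."""

def raise_terminated(signum, frame):
    # Turn the signal into an exception in the main thread.
    raise Terminated("received signal %d" % signum)

def main_loop(steps):
    """Run each callable in `steps`; report how the loop ended."""
    try:
        for step in steps:
            step()
        return "finished"
    except (KeyboardInterrupt, Terminated):
        # Cleanup (closing files, writing checkpoints) would go here.
        return "interrupted"

# Assumption: SIGTERM is the signal the ranks receive on shutdown.
signal.signal(signal.SIGTERM, raise_terminated)
```

This only helps if the process is given time to react; if the launcher follows up quickly with SIGKILL, the cleanup has to be short.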
