multiprocessing.Queue hanging when Process dies

I have a subprocess started via multiprocessing.Process and a queue created via multiprocessing.Queue.

The main process calls multiprocessing.Queue.get() to receive new data. I don't want a timeout there; the call should block.

However, when the child process dies for whatever reason (killed manually by the user via kill, a segfault, etc.), Queue.get() hangs forever.

How can I avoid that?

I think multiprocessing.Queue is not what I want.

Instead, I now use

parent_conn, child_conn = multiprocessing.Pipe(duplex=True)

to get two multiprocessing.connection.Connection objects. Then I call os.fork() or use multiprocessing.Process. In the child, I do:

parent_conn.close()
# read/write on child_conn

In the parent (after the fork), I do:

child_conn.close()
# read/write on parent_conn

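That way, when I call recv() on the connection, it raises an exception (EOFError) if the other process (child or parent) has died in the meantime. Closing the unused end in each process is what makes this work: EOF is only reported once every copy of the other end has been closed.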
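
As a rough illustration of the setup above, here is a minimal sketch. It assumes a Unix-like system and explicitly picks the "fork" start method to match the os.fork() scenario; the function names child_main and run_until_child_dies are made up for this example, and the child kills itself with SIGKILL only to simulate an abrupt death.

```python
import multiprocessing
import os
import signal

# Assumption: Unix-like OS. "fork" mirrors the os.fork() setup described above.
ctx = multiprocessing.get_context("fork")


def child_main(parent_conn, child_conn):
    """Hypothetical child: sends one message, then dies abruptly."""
    parent_conn.close()  # the child does not use the parent's end
    child_conn.send("ready")
    # Simulate an unexpected death, as with `kill -9` from a shell.
    os.kill(os.getpid(), signal.SIGKILL)


def run_until_child_dies():
    """Hypothetical parent loop: blocking recv() with no timeout."""
    parent_conn, child_conn = ctx.Pipe(duplex=True)
    proc = ctx.Process(target=child_main, args=(parent_conn, child_conn))
    proc.start()
    # Without this close the parent itself keeps the child's end open,
    # and recv() would never see EOF.
    child_conn.close()

    received = []
    try:
        while True:
            received.append(parent_conn.recv())
    except EOFError:
        pass  # the other end is gone: the child exited or was killed
    proc.join()
    return received, proc.exitcode
```

Calling run_until_child_dies() should return the message sent before the crash together with a negative exit code, which is how multiprocessing reports death by signal.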

Note that this works only for a single child. I guess Queue is meant for the case where you have multiple children. In that case, you would probably have some manager anyway that watches whether all children are alive and restarts them as needed.

The Queue has no way of knowing when it does not have any possible writers anymore. You could pass the object to any number of subprocesses, and it does not know if you passed it to any given subprocess. So it will have to wait, even if a subprocess dies. A queue is not a file descriptor that is automatically closed when the child dies.

What you are looking for is some kind of supervisor in the parent process that notices when children die unexpectedly and handles that situation in whatever way you think appropriate. You can do this by catching the SIGCHLD signal, checking Process.is_alive, or using Process.join in a thread. A simple implementation would use the timeout parameter in the Queue.get call and do a Process.is_alive check whenever the call times out.
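
The simple timeout-based variant could look roughly like the sketch below. The names ChildDied, get_or_fail, crashing_worker and demo, as well as the 0.2 second poll interval, are assumptions made for illustration; the "fork" start method again assumes a Unix-like system. From the caller's point of view get_or_fail still blocks, it just wakes up periodically to check on the child.

```python
import multiprocessing
import os
import queue
import signal

# Assumption: Unix-like OS with the "fork" start method.
ctx = multiprocessing.get_context("fork")


class ChildDied(Exception):
    """Hypothetical error raised when the producer process is gone."""


def get_or_fail(q, proc, poll_interval=0.2):
    """Block until an item arrives, or raise ChildDied if proc has exited."""
    while True:
        try:
            return q.get(timeout=poll_interval)
        except queue.Empty:
            if proc.is_alive():
                continue
            # The child may have written an item just before dying,
            # so look once more before giving up.
            try:
                return q.get(timeout=poll_interval)
            except queue.Empty:
                raise ChildDied(f"child exit code: {proc.exitcode}") from None


def crashing_worker(q):
    """Hypothetical child: delivers one item, then dies abruptly."""
    q.put("partial result")
    q.close()
    q.join_thread()  # make sure the item reaches the pipe first
    os.kill(os.getpid(), signal.SIGKILL)


def demo():
    q = ctx.Queue()
    proc = ctx.Process(target=crashing_worker, args=(q,))
    proc.start()
    first = get_or_fail(q, proc)
    try:
        get_or_fail(q, proc)
        died = False
    except ChildDied:
        died = True
    proc.join()
    return first, died, proc.exitcode
```

The poll interval is a trade-off: a shorter one notices a dead child sooner, a longer one wakes the parent less often.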

If you have a bit more control over how the child process dies, it should send an "EOF"-type object (None, or some other marker that it is done) to the queue so your parent process can handle it correctly.
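
A minimal sketch of the marker approach follows, assuming None is never a legitimate payload. The names DONE, worker and consume are illustrative, and the "fork" start method assumes a Unix-like system. The finally block covers normal exit and Python exceptions, but it cannot run if the child is killed with SIGKILL or segfaults, so this complements a liveness check rather than replacing it.

```python
import multiprocessing

# Assumption: Unix-like OS with the "fork" start method.
ctx = multiprocessing.get_context("fork")

# Assumption: None never occurs as real data, so it can mark the end.
DONE = None


def worker(q, items):
    """Hypothetical child: doubles each item, then signals completion."""
    try:
        for item in items:
            q.put(item * 2)
    finally:
        # Runs on normal exit and on Python exceptions,
        # but not on SIGKILL or a segfault.
        q.put(DONE)


def consume(items):
    """Hypothetical parent: blocking get() until the marker arrives."""
    q = ctx.Queue()
    proc = ctx.Process(target=worker, args=(q, items))
    proc.start()
    results = []
    while True:
        item = q.get()  # blocks, no timeout
        if item is DONE:
            break
        results.append(item)
    proc.join()
    return results
```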
