
Python multiprocessing - watchdog process?

I have a set of long-running processes in a typical "pub/sub" setup with queues for communication.

I would like to do two things, and I can't figure out how to accomplish both simultaneously:

  1. Addition/removal of workers. For example, I want to be able to add extra consumers if I see that my pending queue size has grown too large.
  2. Watchdog for my processes - I want to be notified if any of my producers or consumers crashes.

I can do (2) in isolation:

from time import sleep

try:
    while True:
        for process in workers + consumers:
            if not process.is_alive():
                logger.critical("%-8s%s died!", process.pid, process.name)
        sleep(3)
except KeyboardInterrupt:
    # The terminal delivers CTRL+C to all workers as well, no need to terminate them
    logger.warning('Received CTRL+C, shutting down')

The loop above blocks, which prevents me from doing (1).

So I decided to move the code into its own process.

This doesn't work, because process.is_alive() only works for a parent checking the status of its children. In this case, the processes I want to check would be siblings instead of children.

I'm a bit stumped on how to proceed. How can my main process support changes to subprocesses while also monitoring subprocesses?

multiprocessing.Pool actually has a watchdog built in already. It runs a thread that checks every 0.1 seconds to see if a worker has died. If one has, it starts a new one to take its place:

def _handle_workers(pool):
    thread = threading.current_thread()

    # Keep maintaining workers until the cache gets drained, unless the pool
    # is terminated.
    while thread._state == RUN or (pool._cache and thread._state != TERMINATE):
        pool._maintain_pool()
        time.sleep(0.1)
    # send sentinel to stop workers
    pool._taskqueue.put(None)
    debug('worker handler exiting')

def _maintain_pool(self):
    """Clean up any exited workers and start replacements for them.
    """
    if self._join_exited_workers():
        self._repopulate_pool()

This is primarily used to implement the maxtasksperchild keyword argument, and is actually problematic in some cases. If a process dies while a map or apply call is running, and that process is in the middle of handling a task associated with that call, the call will never finish. See this question for more information about that behavior.

That said, if you just want to know that a process has died, you can create a thread (not a process) that monitors the pids of all the processes in the pool. If the pids in the list ever change, you know a process has crashed:

import time

def monitor_pids(pool):
    # pool._pool is the Pool's private list of worker Process objects
    pids = [p.pid for p in pool._pool]
    while True:
        new_pids = [p.pid for p in pool._pool]
        if new_pids != pids:
            print("A worker died")
            pids = new_pids
        time.sleep(3)
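One way to run such a monitor without blocking the main thread is sketched below. This is only an illustration, not part of multiprocessing: the name watch_pool_pids, the on_change callback, and the stop_event parameter are hypothetical, and pool._pool is a private CPython attribute that may change between versions. The sketch reports which pids vanished and can be stopped cleanly.

```python
import threading


def watch_pool_pids(pool, on_change, stop_event, interval=3.0):
    """Call on_change(vanished_pids) whenever worker pids disappear.

    Hypothetical helper: relies on the private attribute pool._pool,
    the list of worker Process objects kept by multiprocessing.Pool.
    """
    # Copy the list first so a concurrent update cannot disturb iteration.
    pids = {p.pid for p in list(pool._pool)}
    # Event.wait() acts as a sleep that ends early once stop_event is set.
    while not stop_event.wait(interval):
        new_pids = {p.pid for p in list(pool._pool)}
        vanished = pids - new_pids
        if vanished:
            on_change(sorted(vanished))
        pids = new_pids


def start_pid_watcher(pool, on_change, interval=3.0):
    """Run watch_pool_pids in a daemon thread; returns (thread, stop_event)."""
    stop_event = threading.Event()
    thread = threading.Thread(
        target=watch_pool_pids,
        args=(pool, on_change, stop_event, interval),
        daemon=True,
    )
    thread.start()
    return thread, stop_event
```

With a real pool this might be started as start_pid_watcher(pool, print) right after creating the pool, and stopped later by setting the returned event. Because the thread lives in the parent process, the main thread stays free for other work.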

Edit:

If you're rolling your own Pool implementation, you can just take a cue from multiprocessing.Pool, and run your monitoring code in a background thread in the parent process. The checks to see if the processes are still running are quick, so the time lost to the background thread taking the GIL should be negligible. Consider that the multiprocessing.Pool watchdog is running every 0.1 seconds! Running yours every 3 seconds shouldn't cause any problems.
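A minimal sketch of that idea, assuming a hand-rolled set of workers: the WorkerSet class and its add, adopt, remove, and close methods are hypothetical names invented for this illustration. The watchdog thread runs in the parent, so is_alive() is always called by the process that started the children, while the main thread remains free to add or remove workers.

```python
import logging
import multiprocessing
import threading

logger = logging.getLogger("watchdog")


class WorkerSet:
    """Hypothetical container: child processes plus a watchdog thread."""

    def __init__(self, interval=3.0):
        self._lock = threading.Lock()
        self._workers = []
        self._interval = interval
        self._stop = threading.Event()
        self.dead = []  # workers the watchdog found dead
        self._thread = threading.Thread(target=self._watch, daemon=True)
        self._thread.start()

    def add(self, target, *args):
        """Start one more child process and track it."""
        proc = multiprocessing.Process(target=target, args=args)
        proc.start()
        self.adopt(proc)
        return proc

    def adopt(self, proc):
        """Track an already started process (anything with is_alive())."""
        with self._lock:
            self._workers.append(proc)

    def remove(self, proc):
        """Stop tracking a worker, then terminate it."""
        with self._lock:
            if proc in self._workers:
                self._workers.remove(proc)
        proc.terminate()

    def _watch(self):
        # Event.wait() is a sleep that ends early when close() is called.
        while not self._stop.wait(self._interval):
            with self._lock:
                died = [p for p in self._workers if not p.is_alive()]
                for p in died:
                    self._workers.remove(p)
            for p in died:
                logger.critical("%-8s%s died!", p.pid, p.name)
                self.dead.append(p)

    def close(self):
        """Stop the watchdog thread (does not terminate the workers)."""
        self._stop.set()
        self._thread.join()
```

The main process could then call something like workers.add(consume, queue) whenever the pending queue grows too large, and workers.remove(proc) to scale back down; consume and queue stand in for your own consumer function and queue. Workers removed on purpose are dropped from tracking before they are terminated, so the watchdog only reports unexpected deaths.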
