TProcessPoolServer graceful shutdown?

How do you gracefully shut down the Python Thrift server, TProcessPoolServer? I haven't found any documentation, examples, or blog posts. What follows is my experience so far.

I'm running my Thrift server directly on the command line (./thrift_service.py), not under a supervisor. I'm using Python 2.6 and Thrift 0.8.0.

I initially tried:

server = TProcessPoolServer(processor, transport, tfactory, pfactory)
try:
    server.serve()
finally:
    server.stop()

When I send SIGTERM to the parent Python process, I see "Terminated" in the output and the process is killed, but its children are orphaned and continue to run.
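A likely reason the finally block never runs: with no handler installed, the default action for SIGTERM ends the Python process immediately, without raising an exception. Below is a minimal sketch of one workaround, which turns SIGTERM into SystemExit so that finally executes. The names run_with_cleanup, serve, and cleanup are hypothetical stand-ins for the code around server.serve() and server.stop(); this has not been checked against Thrift 0.8.0.

```python
import signal
import sys


def raise_system_exit(signum, frame):
    # Turn the signal into an ordinary exception so try/finally blocks run.
    sys.exit(0)


def run_with_cleanup(serve, cleanup):
    """Run serve() and always call cleanup(), even when SIGTERM arrives.

    `serve` and `cleanup` are hypothetical callables, for example
    server.serve and server.stop.
    """
    previous = signal.signal(signal.SIGTERM, raise_system_exit)
    try:
        serve()
    finally:
        cleanup()
        # Restore whatever SIGTERM handler was installed before.
        signal.signal(signal.SIGTERM, previous)
```

It would be called as run_with_cleanup(server.serve, server.stop). Whether server.stop() by itself also ends the worker processes is not something this sketch establishes.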

Then I stumbled across the Thrift server tests and tried:

import logging
import signal
def set_alarm(server):
    def clean_shutdown(signum, frame):
        for worker in server.workers:
            logging.error("Terminating worker: {0}".format(worker))
            worker.terminate()
        logging.error("Requesting server to stop()")
        try:
            server.stop()
        except (KeyboardInterrupt, SystemExit):
            pass
        except Exception as err:
            logging.exception(err)
    def logme(s, *args, **kwargs):
        logging.error(">>> {0} <<<".format(s))
        clean_shutdown(*args, **kwargs)
    signal.signal(signal.SIGALRM, clean_shutdown)
    signal.signal(signal.SIGHUP, clean_shutdown)
    signal.signal(signal.SIGINT, clean_shutdown)
    signal.signal(signal.SIGTERM, lambda x, y: logme("SIGTERM", x, y))
server = TProcessPoolServer(processor, transport, tfactory, pfactory)
set_alarm(server)
server.serve()

When I send SIGTERM, SIGALRM, SIGHUP, or SIGINT to the parent Python process, the server stops accepting connections, but the processes are not terminated.

In the output I see:

ERROR:root:>>> SIGTERM <<<
ERROR:root:Terminating worker: <Process(Process-1, started daemon)>
ERROR:root:Terminating worker: <Process(Process-2, started daemon)>
ERROR:root:Terminating worker: <Process(Process-3, started daemon)>
ERROR:root:Terminating worker: <Process(Process-4, started daemon)>
ERROR:root:Terminating worker: <Process(Process-5, started daemon)>
ERROR:root:Requesting server to stop()

This is expected, but then the signal is caught again, the processes are no longer in a started state, and the server is asked to stop again. This repeats around ten times, and then there is no more output.

ERROR:root:>>> SIGTERM <<<
ERROR:root:Terminating worker: <Process(Process-1, unknown daemon)>
ERROR:root:Requesting server to stop()
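One possible explanation for the repeated output, offered as an assumption rather than something verified against Thrift 0.8.0: the handlers are registered before serve() forks the workers, so each worker inherits them. worker.terminate() then delivers SIGTERM to a worker, which runs the same clean_shutdown itself. The "unknown" status in the repr appears consistent with this, since multiprocessing prints it when a Process object is inspected from a process other than the one that created it. A minimal sketch of a guard follows; parent_only is a hypothetical helper name.

```python
import os


def parent_only(handler, owner_pid=None):
    """Wrap a signal handler so it only acts in the process that owns it.

    `owner_pid` defaults to the PID of the process creating the wrapper.
    """
    if owner_pid is None:
        owner_pid = os.getpid()

    def guarded(signum, frame):
        if os.getpid() != owner_pid:
            # A forked worker inherited this handler: exit instead of
            # running the parent's shutdown logic.
            raise SystemExit(0)
        return handler(signum, frame)

    return guarded
```

The registration would then read signal.signal(signal.SIGTERM, parent_only(clean_shutdown)), made in the parent before serve() is called.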

And sometimes I see an AssertionError from within the multiprocessing library:

Traceback (most recent call last):
  File "/path/to/thrift_service.py", line 340, in clean_shutdown
    server.stop()
  File "/usr/local/lib/python2.6/dist-packages/thrift/server/TProcessPoolServer.py", line 123, in stop
    self.stopCondition.notify()
  File "/usr/lib/python2.6/multiprocessing/synchronize.py", line 223, in notify
    assert not self._wait_semaphore.acquire(False)
AssertionError

I have added a graceful shutdown to a TProcessPoolServer in Python using signals and the postForkCallback that it exposes. TProcessPoolServer calls your postForkCallback in each worker process once the worker has initialized. This lets you set up signal handlers and shut down gracefully. Since the workers catch both SystemExit and KeyboardInterrupt, you can set up a handler for SIGINT and, once you have finished cleaning up, call sys.exit(0), which causes the worker to shut down.

import signal
import sys

def setupHandlers():
    signal.signal(signal.SIGINT, handleSIGINT)
    # Optionally, if you want to keep the current socket connection open and working,
    # tell Python to make system calls non-interruptible, which is probably what you want.
    signal.siginterrupt(signal.SIGINT, False)

def handleSIGINT(sig, frame):
    # Clean up state or whatever is necessary
    sys.exit(0)

server = TProcessPoolServer(processor, transport, tfactory, pfactory)
server.setPostForkCallback(setupHandlers)

# Set up handlers in the main process too
setupHandlers()

# Start the server (TProcessPoolServer exposes serve(), not start())
server.serve()

This way, every worker process that is spawned installs the signal handlers needed for a graceful shutdown. In this example I set the same handler for the main process and the workers, which may work depending on your use case, but you can easily define a different handler for the main process if you need to. Remember that the handler is called in the context of each process, so you won't be able to share state across processes during cleanup.
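As a rough illustration of a separate main-process handler, the sketch below forwards SIGINT to each worker and waits for it to exit. It uses plain multiprocessing so that it runs on its own; worker_main and stop_workers are hypothetical names, and the assumption that the server's worker list (server.workers in the question's code) can be passed to stop_workers has not been checked against Thrift.

```python
import multiprocessing
import os
import signal
import sys
import time


def worker_main(ready):
    # Worker-side handler: clean up here, then exit with status 0.
    signal.signal(signal.SIGINT, lambda signum, frame: sys.exit(0))
    ready.set()  # tell the parent the handler is installed
    while True:
        time.sleep(0.1)  # placeholder for serving requests


def stop_workers(workers, timeout=5.0):
    """Send SIGINT to every live worker, wait for each, return exit codes."""
    for worker in workers:
        if worker.is_alive():
            os.kill(worker.pid, signal.SIGINT)
    for worker in workers:
        worker.join(timeout)
    return [worker.exitcode for worker in workers]
```

A main-process handler could then call stop_workers(server.workers) followed by sys.exit(0), while the workers keep the simpler handler installed through setPostForkCallback.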

See http://docs.python.org/library/signal.html for more details on what signal.siginterrupt does and why you may need it.

Edit: You will need to send SIGINT to all of the processes, either with Ctrl + C or, if the server is running as a daemon, with kill -SIGINT [pids of all processes].

You can get the PIDs of the workers easily using ps --ppid [parent pid].
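The same lookup can be scripted. The sketch below assumes Python 3 and a procps-style ps that supports --ppid; parse_pids, child_pids, and interrupt_all are hypothetical helper names, and the order in which the signals are sent is an arbitrary choice.

```python
import os
import signal
import subprocess


def parse_pids(ps_output):
    """Extract PIDs from ps output, skipping the header and blank lines."""
    pids = []
    for line in ps_output.splitlines():
        fields = line.split()
        if fields and fields[0].isdigit():
            pids.append(int(fields[0]))
    return pids


def child_pids(parent_pid):
    # Same command as in the text above; --ppid is a procps (Linux) option.
    proc = subprocess.run(
        ["ps", "--ppid", str(parent_pid)],
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    return parse_pids(proc.stdout)


def interrupt_all(parent_pid):
    """Send SIGINT to the workers and then to the parent; return the PIDs tried."""
    targets = child_pids(parent_pid) + [parent_pid]
    for pid in targets:
        try:
            os.kill(pid, signal.SIGINT)
        except ProcessLookupError:
            pass  # the process has already exited
    return targets
```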

After the program started, I recorded the process ID of the main process. Then I used ps --ppid to find the child processes of the main process and killed them one by one.

The stop function from my service's control shell script:

function stop
{
    SERVER_PID=`cat logs/server.pid`
    SPIDS=`ps --ppid $SERVER_PID | awk '{if ($1!="PID") print $1}'`
    kill -9 $SERVER_PID
    for PID in $SPIDS
    do
        kill -9 $PID
    done
}
