
How to auto-restart a Python script on fail?

This post describes how to keep a child process alive in a BASH script:

How do I write a bash script to restart a process if it dies?

This worked great for calling another BASH script.

However, I tried something similar where the child process is a Python script, daemon.py, which creates a forked child process that runs in the background:

#!/bin/bash

PYTHON=/usr/bin/python2.6

function myprocess {
    $PYTHON daemon.py start
}

NOW=$(date +"%b-%d-%y")

until myprocess; do
    echo "$NOW Prog crashed. Restarting..." >> error.txt
    sleep 1
done

Now the behaviour is completely different. It seems the Python script is no longer a child of the bash script but has 'taken over' the BASH script's PID, so there is no longer a BASH wrapper around the called script... why?

A daemon process double-forks, as the key point of daemonizing itself, so the PID that the parent process has is of no value (the process it refers to goes away very soon after the child process starts).
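The effect of the double fork on a waiting wrapper can be sketched roughly as follows. This is an illustration only, not the asker's daemon.py: it assumes Python 3 on a POSIX system, and `daemonize` and `run_launcher` are hypothetical names. `run_launcher` plays the role of the bash wrapper: it starts a launcher, waits for it, and then learns the real daemon's PID through a pipe.

```python
import os
import time


def daemonize():
    """Classic double fork (POSIX only). Returns only in the daemon process."""
    if os.fork() > 0:
        os._exit(0)   # the process the wrapper started exits immediately
    os.setsid()       # new session, no controlling terminal
    if os.fork() > 0:
        os._exit(0)   # the intermediate process exits as well


def run_launcher(work):
    """Play the wrapper's role: start a launcher, wait for it, report PIDs.

    Returns (launcher_pid, launcher_wait_status, daemon_pid).
    """
    read_fd, write_fd = os.pipe()
    launcher_pid = os.fork()
    if launcher_pid == 0:
        try:
            os.close(read_fd)
            daemonize()
            # Only the daemon (grandchild of the launcher) reaches this point.
            os.write(write_fd, str(os.getpid()).encode())
            os.close(write_fd)
            work()
        finally:
            os._exit(0)
    os.close(write_fd)
    _, status = os.waitpid(launcher_pid, 0)   # returns as soon as the launcher exits
    daemon_pid = int(os.read(read_fd, 32))
    os.close(read_fd)
    return launcher_pid, status, daemon_pid
```

If the launcher exits with status 0 in this way, a loop such as `until myprocess; do ... done` would see success on the first pass and finish, while the daemon keeps running under a PID the wrapper never saw.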

Therefore, a daemon process should write its PID to a file in a "well-known location" that, by convention, the parent process knows to read from. With this (traditional) approach, the parent process, if it wants to act as a restarting watchdog, can simply read the daemon's PID from the well-known location, periodically check whether the daemon is still alive, and restart it when needed.
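A minimal sketch of such a PID-file watchdog, assuming Python 3 on a POSIX system. The function names, the PID file path, and the `restart` callback are all hypothetical; the daemon is assumed to write its own PID to the file when it starts.

```python
import os
import time


def read_pid(pid_file):
    """Return the PID stored in pid_file, or None if missing or malformed."""
    try:
        with open(pid_file) as fh:
            return int(fh.read().strip())
    except (OSError, ValueError):
        return None


def is_alive(pid):
    """Probe a PID with signal 0: nothing is delivered, only the checks run."""
    if pid is None or pid <= 0:
        return False
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False          # no such process
    except PermissionError:
        return True           # process exists but belongs to another user
    return True


def watch(pid_file, restart, interval=5.0, rounds=None):
    """Check the daemon every `interval` seconds; call restart() when it is gone.

    `rounds` limits the number of checks (None means loop forever).
    Returns the number of restarts triggered.
    """
    restarts = 0
    checked = 0
    while rounds is None or checked < rounds:
        if not is_alive(read_pid(pid_file)):
            restart()
            restarts += 1
        checked += 1
        time.sleep(interval)
    return restarts
```

A call might look like `watch("/var/run/mydaemon.pid", restart=start_daemon)`, where both the path and `start_daemon` are placeholders. Note that the signal-0 probe only shows that some process owns that PID, which is why a stale PID file needs extra care.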

It takes some care in execution, of course (a "stale" PID will stay in the "well-known location" file for a while, and the parent must take that into account), and there are possible variants: the daemon could emit a "heartbeat" (via UDP broadcast or the like) so that the parent can detect not just dead daemons but also ones that are "stuck forever", e.g. due to a deadlock, since they stop giving their heartbeat. But that's the general idea.
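The heartbeat variant could be sketched like this, assuming Python 3. For simplicity it swaps the UDP broadcast for plain UDP datagrams on the loopback interface, and all function names and the payload are hypothetical. The daemon would call `send_heartbeat` from its main loop; the watchdog treats a missed deadline as "dead or stuck".

```python
import socket


def open_listener(host="127.0.0.1", port=0):
    """Create the watchdog's UDP socket; port 0 lets the OS pick a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def send_heartbeat(address, payload=b"alive"):
    """Called by the daemon on every pass through its main loop."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, address)


def heartbeat_received(listener, timeout):
    """Return True if a heartbeat arrives within `timeout` seconds."""
    listener.settimeout(timeout)
    try:
        listener.recvfrom(64)
        return True
    except socket.timeout:
        return False
```

Because a deadlocked daemon still has a live PID, a missed heartbeat catches a failure that a PID check alone would not.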

You should look at Python Enhancement Proposal (PEP) 3143 here. In it, Ben suggests including a daemon library in the Python standard library. He covers a lot of very good information about daemons, and it is a pretty easy read. The reference implementation is here.

It seems that the behavior is completely different because here your "daemon.py" is launched in the background as a daemon.

In the other link you pointed to, the process being supervised is not a daemon; it does not start in the background. The launcher simply waits until the child process stops.
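For a child that stays in the foreground, the waiting launcher can be sketched in Python as well. This is a rough equivalent of the `until ...; do ...; done` loop, assuming Python 3; `supervise` and its parameters are hypothetical names.

```python
import subprocess
import time


def supervise(command, delay=1.0, max_restarts=None):
    """Run `command` in the foreground and rerun it until it exits with status 0.

    Returns the number of restarts performed.
    """
    restarts = 0
    while True:
        status = subprocess.call(command)   # blocks until the child exits
        if status == 0:
            return restarts
        if max_restarts is not None and restarts >= max_restarts:
            raise RuntimeError("giving up after %d restarts" % restarts)
        restarts += 1
        time.sleep(delay)
```

This only works when the command itself is the long-running process. If the command daemonizes, `subprocess.call` returns as soon as the launcher exits, which is the situation described in the question.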

There are several ways to overcome this. The classical one is the way @Alex explains, using a PID file in a conventional place.

Another way could be to build the watchdog inside your running daemon and daemonize the watchdog... this would simulate a correct process that does not break at random (something that shouldn't occur)...

Make use of https://github.com/ut0mt8/simple-ha.

simple-ha

Tired of keepalived, corosync, pacemaker, heartbeat or whatever? Here is a simple daemon which ensures a heartbeat between two hosts. One is active and the other is backup, launching a script when changing state. Simple implementation, KISS. Production ready (at least it works for me :)

Life will be too easy!
