Restarting a Python `while True` loop after it exits unexpectedly

Question

I wrote a Python script with a `while True` loop that fetches email attachments, but I found that it sometimes exits unexpectedly on the server.

I ran it on my local machine for more than 4 hours with no problem, so I can confirm that the code is correct.

Is there some mechanism, such as process monitoring, to restart Python when it exits unexpectedly? I am a novice in Linux.

Remark: I run this Python script with `python attachment.py &` in a shell script.

Answer 1

While @triplee's comment will definitely do the trick, I would worry that something is going on that you would be better off understanding: namely, why the script is failing.

Without further details, it's difficult to speculate about what might be happening. As a first debugging step, you might try wrapping the entire body of the while True loop in a try ... except block, and use the except block to log the error and/or the program state. That is:

import logging

while True:
    try:
        ...  # do some stuff
    except Exception:
        # Log the exception with its traceback; record key variables, etc.
        logging.exception("iteration failed")
        continue

This would allow you to understand what is happening during the failure, and to write more robust code that handles that event.
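As a slightly fuller sketch of the same idea, the function below logs the traceback together with the item being processed, then moves on to the next one. The names `process_messages`, `handle`, and `demo_handle` are hypothetical stand-ins for the attachment-fetching code, which the question does not show.

```python
import logging

logger = logging.getLogger("attachment")


def process_messages(messages, handle):
    """Call handle() on each message; log failures and keep going.

    Returns the list of messages whose handling raised an exception.
    """
    failed = []
    for message in messages:
        try:
            handle(message)
        except Exception:
            # logger.exception records the text below plus the full traceback,
            # so the log shows both what failed and on which input.
            logger.exception("Failed to handle message %r", message)
            failed.append(message)
            continue
    return failed


def demo_handle(message):
    # Hypothetical handler: treat an empty message as malformed input.
    if not message:
        raise ValueError("empty message")
```

Catching `Exception` rather than using a bare `except:` leaves Ctrl-C (`KeyboardInterrupt`) and `sys.exit()` working, so the script can still be stopped on purpose.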

Answer 2

You can try using Supervisor to manage your process. Supervisor can be configured to react to the process's exit status and try to restart it.

See the official documentation and the example for Ubuntu:

Example configuration

[program:nodehook]
command=/usr/bin/node /srv/http.js
directory=/srv
autostart=true
autorestart=true
startretries=3
stderr_logfile=/var/log/webhook/nodehook.err.log
stdout_logfile=/var/log/webhook/nodehook.out.log
user=www-data
environment=SECRET_PASSPHRASE='this is secret',SECRET_TWO='another secret'
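To make the restart-on-exit idea concrete, here is a rough Python sketch of a wrapper that reruns a command whenever it exits with a non-zero status. It is only an illustration: it is not how Supervisor is implemented, and it lacks what the configuration above provides (log files, running as another user, starting automatically). The function name `run_with_restarts` and its parameters are made up for this example.

```python
import subprocess
import sys
import time


def run_with_restarts(command, max_restarts=3, delay_seconds=1.0):
    """Run command; if it exits with a non-zero status, start it again.

    Gives up after max_restarts restarts. Returns the list of exit codes seen.
    """
    exit_codes = []
    while True:
        # subprocess.call blocks until the child exits and returns its status.
        exit_code = subprocess.call(command)
        exit_codes.append(exit_code)
        if exit_code == 0:
            break  # clean exit: nothing to restart
        if len(exit_codes) > max_restarts:
            break  # too many failures: stop instead of looping forever
        time.sleep(delay_seconds)
    return exit_codes
```

For the script in the question, one might call `run_with_restarts([sys.executable, "attachment.py"])`. A dedicated process manager is still the more robust choice, since a wrapper like this can itself die or be killed.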

Answer 3

"I ran it on my local machine for more than 4 hours with no problem, so I can confirm that the code is correct."

You could be surprised by the number of bugs that only reveal themselves after months, if not years, of correct processing... What you have confirmed is that the code does not break on the first action, but unless you have tested it with all possible corner cases in the input (including badly formatted ones), you cannot confirm that it will never break.

That is why a program that is intended to run unattended should be carefully designed to always (try to*) leave a trace before exiting. try: except: and the logging module are your best friends here.

* Of course, in case of a system crash or a power outage, there's nothing you can do at the user-program level...
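One possible way to "leave a trace" is sketched below: a top-level wrapper that writes a timestamped log line when the program starts, when it ends, and, if an exception escapes, the full traceback. The names `make_file_logger` and `run_and_log`, the logger name `attachment`, and the log format are assumptions made for this example.

```python
import logging


def make_file_logger(log_file):
    """Return a logger that appends timestamped records to log_file."""
    logger = logging.getLogger("attachment")
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(log_file)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    return logger


def run_and_log(main, logger):
    """Run main() and make sure the log records how the program ended."""
    logger.info("starting")
    try:
        main()
    except Exception:
        # Record the traceback, then re-raise so that Python still exits
        # with a non-zero status and a process monitor can react to it.
        logger.exception("unhandled exception, exiting")
        raise
    finally:
        # Runs on every exit path that Python itself controls.
        logger.info("exiting")
```

It could be called as `run_and_log(main, make_file_logger("attachment.log"))`, where `main` is a function containing the `while True` loop and the file name is a placeholder. If the log ends without an "exiting" line, the process was probably killed from outside or the machine went down, which is the case the footnote describes.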


Answer 1
2 2017-11-22 08:12:56

Answer 2
0 2017-11-22 08:06:47

Answer 3
0 2017-11-22 08:22:46
