How do I run recurring tasks in the Python Flask framework?

Question

I'm building a website that provides some information to visitors. This information is aggregated in the background by polling a couple of external APIs every 5 seconds. At the moment I do this with APScheduler jobs. I initially preferred APScheduler because it makes the whole system easier to port (I don't need to set up cron jobs on the new machine). I start the polling functions as follows:

# APScheduler 2.x API; Scheduler and add_interval_job were removed in 3.0
from apscheduler.scheduler import Scheduler

@app.before_first_request
def initialize():
    apsched = Scheduler()
    apsched.start()

    apsched.add_interval_job(checkFirstAPI, seconds=5)
    apsched.add_interval_job(checkSecondAPI, seconds=5)
    apsched.add_interval_job(checkThirdAPI, seconds=5)

This kind of works, but there are some problems with it:

For starters, this means that the interval jobs run outside of the Flask context. So far this hasn't been much of a problem, but when calling an endpoint fails I want the system to send me an email (saying "hey, calling API X failed"). Because the job doesn't run within the Flask context, however, it complains that flask-mail cannot be executed (RuntimeError('working outside of application context')).
Secondly, I wonder how this is going to behave when I no longer use the Flask built-in debug server, but a production server with, let's say, 4 workers. Will it start every job four times then?

All in all, I feel that there should be a better way of running these recurring tasks, but I'm unsure how. Does anybody out there have an interesting solution to this problem? All tips are welcome!

[EDIT] I've just been reading about Celery and its schedules. Although I don't really see how Celery differs from APScheduler, and whether it could therefore solve my two points, I wonder if anyone reading this thinks I should look further into Celery?

[CONCLUSION] About two years later I'm reading this again, and I thought I could let you guys know what I ended up with. I figured that @BluePeppers was right in saying that I shouldn't be tied so closely to the Flask ecosystem. So I opted for regular cron jobs running every minute, which are set up using Ansible. Although this makes things a bit more complex (I needed to learn Ansible and convert some code so that running it every minute would be enough), I think it is more robust. I'm currently using the awesome python-rq for queueing async jobs (checking APIs and sending emails). I just found out about rq-scheduler. I haven't tested it yet, but it seems to do precisely what I needed in the first place. So maybe this is a tip for future readers of this question.
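For readers wondering what "convert some code so that running it every minute would be enough" might look like, here is a minimal sketch of a standalone polling script with no Flask dependency. It is not the code from the question: the endpoint URLs and the names fetch_url, poll_once and main are hypothetical, and it only uses the standard library.

```python
import sys
import urllib.request

# Hypothetical endpoints; replace with the real API URLs.
ENDPOINTS = {
    "first": "https://api.example.com/first",
    "second": "https://api.example.com/second",
}


def fetch_url(url, timeout=10):
    """Fetch one URL and return the raw response body."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def poll_once(endpoints, fetcher=fetch_url):
    """Call every endpoint once; collect results and failures separately."""
    results, failures = {}, {}
    for name, url in endpoints.items():
        try:
            results[name] = fetcher(url)
        except Exception as exc:  # any failure should be reported, not crash the run
            failures[name] = str(exc)
    return results, failures


def main():
    """Entry point for a cron job: one polling pass, non-zero exit on failure."""
    results, failures = poll_once(ENDPOINTS)
    for name, error in failures.items():
        print(f"calling {name} failed: {error}", file=sys.stderr)
    return 1 if failures else 0
```

A cron entry would then invoke a small wrapper that calls sys.exit(main()) once per minute. Because each run is a separate process, the number of web workers no longer affects how often the APIs are polled.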

For the rest, I just wish all of you a beautiful day!

Answer 1

(1)

You can use the app.app_context() context manager to set the application context. I imagine usage would go something like this:

from apscheduler.scheduler import Scheduler

def checkSecondAPI():
    with app.app_context():
        # Do whatever you were doing to check the second API
        pass

@app.before_first_request
def initialize():
    apsched = Scheduler()
    apsched.start()

    apsched.add_interval_job(checkFirstAPI, seconds=5)
    apsched.add_interval_job(checkSecondAPI, seconds=5)
    apsched.add_interval_job(checkThirdAPI, seconds=5)

Alternatively, you could use a decorator:

import functools

def with_application_context(app):
    def inner(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            with app.app_context():
                return func(*args, **kwargs)
        return wrapper
    return inner

@with_application_context(app)
def checkFirstAPI():
    # Check the first API as before
    pass
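To see the effect of the decorator outside a request, here is a small self-contained sketch. It assumes Flask is installed; the app name "demo_app" and the helpers read_app_name, run_plain and run_wrapped are made up for the demonstration. The same function is run in a background thread twice: once bare, once wrapped with the application context.

```python
import functools
import threading

from flask import Flask, current_app

app = Flask("demo_app")  # hypothetical app, only for this demonstration


def with_application_context(app):
    """Decorator factory: run the wrapped function inside app.app_context()."""
    def inner(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            with app.app_context():
                return func(*args, **kwargs)
        return wrapper
    return inner


def read_app_name():
    # current_app only resolves while an application context is active.
    return current_app.name


results = {}


def run_plain():
    # No application context in this thread, so current_app raises RuntimeError.
    try:
        results["plain"] = read_app_name()
    except RuntimeError:
        results["plain"] = "RuntimeError"


def run_wrapped():
    results["wrapped"] = with_application_context(app)(read_app_name)()


# Background threads stand in for scheduler jobs here.
for target in (run_plain, run_wrapped):
    thread = threading.Thread(target=target)
    thread.start()
    thread.join()
```

The bare call fails the same way the flask-mail call in the question does, while the wrapped call can read from current_app.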

(2)

Yes, it will still work. The sole (significant) difference is that your application will not be communicating directly with the world; it will go through a reverse proxy or similar via fastcgi/uwsgi/whatever. The only concern is that if multiple instances of the app start, multiple schedulers will be created. To manage this, I would suggest you move your backend tasks out of the Flask application and use a tool designed for running tasks regularly (e.g. Celery). The downside is that you won't be able to use things like Flask-Mail, but IMO it's not good to be so closely tied to the Flask ecosystem; what are you gaining by using Flask-Mail over a standard, non-Flask mail library?

Also, breaking up your application makes it much easier to scale individual components as capacity is required, compared to having one monolithic web application.
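As an illustration of the "standard, non-Flask mail library" route, here is a rough sketch using only Python's smtplib and email.message. The function names, the addresses and the localhost SMTP server are assumptions for the example, not part of the original setup.

```python
import smtplib
from email.message import EmailMessage


def build_failure_email(api_name, error, sender, recipient):
    """Build the "calling API X failed" notification as a plain-text message."""
    msg = EmailMessage()
    msg["Subject"] = f"Calling {api_name} failed"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(f"Polling {api_name} raised: {error}")
    return msg


def send_email(msg, host="localhost", port=25):
    """Send the message through an SMTP server (assumed here to be on localhost)."""
    with smtplib.SMTP(host, port) as smtp:
        smtp.send_message(msg)
```

A background job could call send_email(build_failure_email("API X", exc, "alerts@example.com", "me@example.com")) from its exception handler without needing any Flask application context.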

Answer 1 (24, accepted, 2014-09-03 08:45:39)