
Flask - job not running as a background process

I am trying to run a Flask app which consists of:

  1. Yielding API requests on the fly
  2. Uploading each request to a SQLalchemy database
  3. Running jobs 1 and 2 as a background process

For that I have the following code:

import concurrent.futures
import queue
from concurrent.futures import ThreadPoolExecutor

from flask import Flask, current_app

app = Flask(__name__)
q = queue.Queue()


def build_cache():
    # 1. Yielding API requests on the fly
    track_and_features = spotify.query_tracks()  # <- a generator
    while True:
        q.put(next(track_and_features))


def upload_cache(tracks_and_features):
    # 2. Uploading each request to a `SQLalchemy` database
    with app.app_context():
        Upload_Tracks(filtered_dataset=tracks_and_features)

    return "UPLOADING TRACKS TO DATABASE"


@app.route("/cache")
def cache():
    # 3. Do `1` and `2` as a background process
    with concurrent.futures.ThreadPoolExecutor() as executor:

        future_to_track = {executor.submit(build_cache): "TRACKER DONE"}

        while future_to_track:
            # check for status of the futures which are currently working
            done, not_done = concurrent.futures.wait(
                future_to_track,
                timeout=0.25,
                return_when=concurrent.futures.FIRST_COMPLETED,
            )

            # if there is incoming work, start a new future
            while not q.empty():

                # fetch a track from the queue
                track = q.get()

                # Start the load operation and mark the future with its TRACK
                future_to_track[executor.submit(upload_cache, track)] = track
            # process any completed futures
            for future in done:
                track = future_to_track[future]
                try:
                    data = future.result()
                except Exception as exc:
                    print("%r generated an exception: %s" % (track, exc))

                del future_to_track[future]

    return "Cacheing playlist in the background..."

All of the above works, BUT NOT AS A BACKGROUND PROCESS. The app hangs when cache() is called, and resumes only when the process is done.

I run it with gunicorn -c gconfig.py app:app -w 4 --threads 12

What am I doing wrong?


EDIT: If I simplify things in order to debug this and write simply:

# 1st background process
def build_cache():
    # only ONE JOB
    tracks_and_features = spotify.query_tracks()  # <- a generator
    while True:
        print(next(tracks_and_features))


# background cache
@app.route("/cache")
def cache():
    executor.submit(build_cache)
    return "Cacheing playlist in the background..."

THEN the process runs in the background.

However, if I add another job:

def build_cache():

    tracks_and_features = spotify.query_tracks()
    while True:
        # SQLalchemy db
        Upload_Tracks(filtered_dataset=next(tracks_and_features))

the background execution stops working again.

In short:

Background only works if I run ONE job at a time (which was the limitation behind the idea of using queues in the first place).

It seems like the problem is binding the background process to SQLalchemy, but I don't know. I'm totally lost here.

Still not sure what you meant by:

I mean the app waits for all requests to be made at login and only then goes to the homepage. It should go right away to the homepage, with the requests being made in the background.

There are a few issues here:

  • Your queue is global to the process, i.e. there is only one queue per gunicorn worker; you probably want the queue to be bound to your request so that multiple requests are not sharing the same queue in memory. Consider using context locals.
  • If UploadTracks is writing to the database, there might be a lock on the table. Check your indices and inspect lock waits in your database.
  • SQLAlchemy might be configured with a small connection pool, and the second UploadTracks is waiting for the first to return its connection. See the sketch after this list.
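
If the connection pool does turn out to be the bottleneck, it can be enlarged when the engine is created. A minimal sketch, assuming a plain create_engine() setup (the URL and the sizes are placeholders, not recommendations):

from sqlalchemy import create_engine

# The default QueuePool keeps 5 connections with 10 overflow;
# raising these limits lets more uploads run concurrently.
engine = create_engine(
    "postgresql://user:pass@localhost/tracksdb",  # placeholder URL
    pool_size=20,      # connections kept open in the pool
    max_overflow=10,   # extra connections allowed under burst load
    pool_timeout=30,   # seconds to wait for a free connection
)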

In your first example, the endpoint is waiting on all futures to finish before returning, whereas in your second example, the endpoint returns immediately after submitting tasks to the executor. If you want Flask to respond quickly while the tasks are still running in background threads, remove the with concurrent.futures.ThreadPoolExecutor() as executor: and construct a global thread pool at the top of the module.

Using with, the context manager waits for all submitted tasks before exiting, but I am not sure if that's your main issue.
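
For what it's worth, exiting the with block calls executor.shutdown(wait=True), which blocks until every submitted task has finished. A minimal illustration of that behavior (slow_task is a stand-in for the real jobs):

import time
from concurrent.futures import ThreadPoolExecutor


def slow_task():
    time.sleep(5)  # stands in for build_cache / upload_cache


with ThreadPoolExecutor() as executor:
    executor.submit(slow_task)
    # __exit__ calls executor.shutdown(wait=True) here,
    # so control only leaves the block after slow_task completes

print("reached only after slow_task is done")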

Try to create the ThreadPoolExecutor outside of the route handler:

import time
from concurrent.futures import ThreadPoolExecutor

from flask import Flask


def foo(*args):
    while True:
        print("foo", args)
        time.sleep(10)


app = Flask(__name__)

executor = ThreadPoolExecutor()


@app.route("/cache")
def cache():
    executor.submit(foo, "1")
    executor.submit(foo, "2")
    return "in cache"
