Flask Celery task locking

I am using Flask with Celery and I am trying to lock a specific task so that only one instance of it can run at a time. The Celery docs give an example of doing this: Ensuring a task is only executed one at a time. The example given is for Django, however I am using Flask. I have done my best to convert it to work with Flask, but I still see that myTask1, which has the lock, can be run multiple times.

One thing that is not clear to me is whether I am using the cache correctly; I have never used it before, so all of this is new to me. One thing from the docs that is mentioned but not explained is this:

Doc Notes:

In order for this to work correctly you need to be using a cache backend where the .add operation is atomic. memcached is known to work well for this purpose.

I'm not truly sure what that means. Should I be using the cache in conjunction with a database, and if so, how would I do that? I am using MongoDB. In my code I just have this setup for the cache, cache = Cache(app, config={'CACHE_TYPE': 'simple'}), as that is what is mentioned in the Flask-Cache docs.

Another thing that is not clear to me is whether there is anything different I need to do, since I am calling myTask1 from within my Flask route task1.

Here is an example of the code I am using.

from flask import (Flask, render_template, flash, redirect,
                   url_for, session, logging, request, g, render_template_string, jsonify)
from flask_caching import Cache
from contextlib import contextmanager
from celery import Celery
from Flask_celery import make_celery
from celery.result import AsyncResult
from celery.utils.log import get_task_logger
from celery.five import monotonic
from flask_pymongo import PyMongo
from hashlib import md5
import pymongo
import time


app = Flask(__name__)

cache = Cache(app, config={'CACHE_TYPE': 'simple'})
app.config['SECRET_KEY']= 'super secret key for me123456789987654321'

######################
# MONGODB SETUP
#####################
app.config['MONGO_HOST'] = 'localhost'
app.config['MONGO_DBNAME'] = 'celery-test-db'
app.config["MONGO_URI"] = 'mongodb://localhost:27017/celery-test-db'


mongo = PyMongo(app)


##############################
# CELERY ARGUMENTS
##############################


app.config['CELERY_BROKER_URL'] = 'amqp://localhost//'
app.config['CELERY_RESULT_BACKEND'] = 'mongodb://localhost:27017/celery-test-db'

app.config['CELERY_RESULT_BACKEND'] = 'mongodb'
app.config['CELERY_MONGODB_BACKEND_SETTINGS'] = {
    "host": "localhost",
    "port": 27017,
    "database": "celery-test-db", 
    "taskmeta_collection": "celery_jobs",
}

app.config['CELERY_TASK_SERIALIZER'] = 'json'


celery = Celery('task',broker='mongodb://localhost:27017/jobs')
celery = make_celery(app)


LOCK_EXPIRE = 60 * 2  # Lock expires in 2 minutes


@contextmanager
def memcache_lock(lock_id, oid):
    timeout_at = monotonic() + LOCK_EXPIRE - 3
    # cache.add fails if the key already exists
    status = cache.add(lock_id, oid, LOCK_EXPIRE)
    try:
        yield status
    finally:
        # memcache delete is very slow, but we have to use it to take
        # advantage of using add() for atomic locking
        if monotonic() < timeout_at and status:
            # don't release the lock if we exceeded the timeout
            # to lessen the chance of releasing an expired lock
            # owned by someone else
            # also don't release the lock if we didn't acquire it
            cache.delete(lock_id)



@celery.task(bind=True, name='app.myTask1')
def myTask1(self):

    self.update_state(state='IN TASK')

    lock_id = self.name

    with memcache_lock(lock_id, self.app.oid) as acquired:
        if acquired:
            # do work if we got the lock
            print('acquired is {}'.format(acquired))
            self.update_state(state='DOING WORK')
            time.sleep(90)
            return 'result'

    # otherwise, the lock was already in use
    raise self.retry(countdown=60)  # redeliver message to the queue, so the work can be done later



@celery.task(bind=True, name='app.myTask2')
def myTask2(self):
    print('you are in task2')
    self.update_state(state='STARTING')
    time.sleep(120)
    print('task2 done')


@app.route('/', methods=['GET', 'POST'])
def index():

    return render_template('index.html')

@app.route('/task1', methods=['GET', 'POST'])
def task1():

    print('running task1')
    result = myTask1.delay()

    # get async task id
    taskResult = AsyncResult(result.task_id)


    # push async taskid into db collection job_task_id
    mongo.db.job_task_id.insert({'taskid': str(taskResult), 'TaskName': 'task1'})

    return render_template('task1.html')


@app.route('/task2', methods=['GET', 'POST'])
def task2():

    print('running task2')
    result = myTask2.delay()

    # get async task id
    taskResult = AsyncResult(result.task_id)

    # push async taskid into db collection job_task_id
    mongo.db.job_task_id.insert({'taskid': str(taskResult), 'TaskName': 'task2'})

    return render_template('task2.html') 


@app.route('/status', methods=['GET', 'POST'])
def status():

    taskid_list = []
    task_state_list = []
    TaskName_list = []

    allAsyncData = mongo.db.job_task_id.find()

    for doc in allAsyncData:
        try:
            taskid_list.append(doc['taskid'])
        except:
            print('error with db connection in asyncJobStatus')

        TaskName_list.append(doc['TaskName'])

    # PASS TASK ID TO ASYNC RESULT TO GET TASK RESULT FOR THAT SPECIFIC TASK
    for item in taskid_list:
        try:
            task_state_list.append(myTask1.AsyncResult(item).state)
        except:
            task_state_list.append('UNKNOWN')

    return render_template('status.html', data_list=zip(task_state_list, TaskName_list))

Final Working Code

from flask import (Flask, render_template, flash, redirect,
                   url_for, session, logging, request, g, render_template_string, jsonify)
from flask_caching import Cache
from contextlib import contextmanager
from celery import Celery
from Flask_celery import make_celery
from celery.result import AsyncResult
from celery.utils.log import get_task_logger
from celery.five import monotonic
from flask_pymongo import PyMongo
from hashlib import md5
import pymongo
import time
import redis
from flask_redis import FlaskRedis


app = Flask(__name__)

# ADDING REDIS
redis_store = FlaskRedis(app)

# POINTING CACHE_TYPE TO REDIS
cache = Cache(app, config={'CACHE_TYPE': 'redis'})
app.config['SECRET_KEY']= 'super secret key for me123456789987654321'

######################
# MONGODB SETUP
#####################
app.config['MONGO_HOST'] = 'localhost'
app.config['MONGO_DBNAME'] = 'celery-test-db'
app.config["MONGO_URI"] = 'mongodb://localhost:27017/celery-test-db'


mongo = PyMongo(app)


##############################
# CELERY ARGUMENTS
##############################

# CELERY USING REDIS
app.config['CELERY_BROKER_URL'] = 'redis://localhost:6379/0'
app.config['CELERY_RESULT_BACKEND'] = 'mongodb://localhost:27017/celery-test-db'

app.config['CELERY_RESULT_BACKEND'] = 'mongodb'
app.config['CELERY_MONGODB_BACKEND_SETTINGS'] = {
    "host": "localhost",
    "port": 27017,
    "database": "celery-test-db", 
    "taskmeta_collection": "celery_jobs",
}

app.config['CELERY_TASK_SERIALIZER'] = 'json'


celery = Celery('task',broker='mongodb://localhost:27017/jobs')
celery = make_celery(app)


LOCK_EXPIRE = 60 * 2  # Lock expires in 2 minutes


@contextmanager
def memcache_lock(lock_id, oid):
    timeout_at = monotonic() + LOCK_EXPIRE - 3
    print('in memcache_lock and timeout_at is {}'.format(timeout_at))
    # cache.add fails if the key already exists
    status = cache.add(lock_id, oid, LOCK_EXPIRE)
    try:
        yield status
        print('memcache_lock and status is {}'.format(status))
    finally:
        # memcache delete is very slow, but we have to use it to take
        # advantage of using add() for atomic locking
        if monotonic() < timeout_at and status:
            # don't release the lock if we exceeded the timeout
            # to lessen the chance of releasing an expired lock
            # owned by someone else
            # also don't release the lock if we didn't acquire it
            cache.delete(lock_id)



@celery.task(bind=True, name='app.myTask1')
def myTask1(self):

    self.update_state(state='IN TASK')
    print('dir is {} '.format(dir(self)))

    lock_id = self.name
    print('lock_id is {}'.format(lock_id))

    with memcache_lock(lock_id, self.app.oid) as acquired:
        print('in memcache_lock and lock_id is {} self.app.oid is {} and acquired is {}'.format(lock_id, self.app.oid, acquired))
        if acquired:
            # do work if we got the lock
            print('acquired is {}'.format(acquired))
            self.update_state(state='DOING WORK')
            time.sleep(90)
            return 'result'

    # otherwise, the lock was already in use
    raise self.retry(countdown=60)  # redeliver message to the queue, so the work can be done later



@celery.task(bind=True, name='app.myTask2')
def myTask2(self):
    print('you are in task2')
    self.update_state(state='STARTING')
    time.sleep(120)
    print('task2 done')


@app.route('/', methods=['GET', 'POST'])
def index():

    return render_template('index.html')

@app.route('/task1', methods=['GET', 'POST'])
def task1():

    print('running task1')
    result = myTask1.delay()

    # get async task id
    taskResult = AsyncResult(result.task_id)


    # push async taskid into db collection job_task_id
    mongo.db.job_task_id.insert({'taskid': str(taskResult), 'TaskName': 'myTask1'})

    return render_template('task1.html')


@app.route('/task2', methods=['GET', 'POST'])
def task2():

    print('running task2')
    result = myTask2.delay()

    # get async task id
    taskResult = AsyncResult(result.task_id)

    # push async taskid into db collection job_task_id
    mongo.db.job_task_id.insert({'taskid': str(taskResult), 'TaskName': 'task2'})

    return render_template('task2.html')

@app.route('/status', methods=['GET', 'POST'])
def status():

    taskid_list = []
    task_state_list = []
    TaskName_list = []

    allAsyncData = mongo.db.job_task_id.find()

    for doc in allAsyncData:
        try:
            taskid_list.append(doc['taskid'])
        except:
            print('error with db connection in asyncJobStatus')

        TaskName_list.append(doc['TaskName'])

    # PASS TASK ID TO ASYNC RESULT TO GET TASK RESULT FOR THAT SPECIFIC TASK
    for item in taskid_list:
        try:
            task_state_list.append(myTask1.AsyncResult(item).state)
        except:
            task_state_list.append('UNKNOWN')

    return render_template('status.html', data_list=zip(task_state_list, TaskName_list))


if __name__ == '__main__':
    app.secret_key = 'super secret key for me123456789987654321'
    app.run(port=1234, host='localhost')

Here is also a screenshot; you can see that I ran myTask1 two times and myTask2 a single time. I now have the expected behavior for myTask1: it will be run by a single worker, and if another worker attempts to pick it up, it will just keep retrying based on whatever I define.

(Screenshot: Flower dashboard)

In your question, you point out this warning from the Celery example you used:

In order for this to work correctly you need to be using a cache backend where the .add operation is atomic. memcached is known to work well for this purpose.

And you mention that you don't really understand what this means. Indeed, the code you show demonstrates that you have not heeded that warning, because your code uses an inappropriate backend.

Consider this code:

with memcache_lock(lock_id, self.app.oid) as acquired:
    if acquired:
        # do some work

What you want here is for acquired to be true only for one thread at a time. If two threads enter the with block at the same time, only one should "win" and have acquired be true. The thread that has acquired true can then proceed with its work, and the other thread has to skip doing the work and try again later to acquire the lock. In order to ensure that only one thread can have acquired true, .add must be atomic.

Here's some pseudocode of what .add(key, value) does:

1. if <key> is already in the cache:
2.   return False    
3. else:
4.   set the cache so that <key> has the value <value>
5.   return True

If the execution of .add is not atomic, here is what could happen if two threads A and B execute .add("foo", "bar"). Assume an empty cache at the start.

  1. Thread A executes 1. if "foo" is already in the cache, finds that "foo" is not in the cache, and jumps to line 3, but then the thread scheduler switches control to thread B.
  2. Thread B also executes 1. if "foo" is already in the cache, and also finds that "foo" is not in the cache. So it jumps to line 3 and then executes lines 4 and 5, which set the key "foo" to the value "bar", and the call returns True.
  3. Eventually, the scheduler gives control back to thread A, which continues executing lines 3, 4 and 5, also sets the key "foo" to the value "bar", and also returns True.

What you have here is two .add calls that return True. If these .add calls are made within memcache_lock, then two threads can both have acquired be true, so two threads could do work at the same time, and your memcache_lock is not doing what it should be doing, which is to allow only one thread to work at a time.
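
For contrast, here is a minimal sketch of an acquisition step that is atomic, using Redis's SET with the NX (set-if-not-exists) flag and an expiry; the check and the write happen as a single server-side command, so the interleaving described above cannot produce two winners. The connection details and key name are assumptions for illustration only.

import redis

r = redis.StrictRedis(host='localhost', port=6379, db=0)  # assumed local Redis instance

def atomic_add(key, value, expire_seconds):
    # SET key value NX EX <ttl> is a single atomic command on the Redis server:
    # it sets the key only if it does not already exist.
    # redis-py returns True on success and None if the key was already present.
    return bool(r.set(key, value, nx=True, ex=expire_seconds))

# Only one of two concurrent callers will ever see True here.
acquired = atomic_add('app.myTask1', 'some-worker-id', 120)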

You are not using a cache that ensures that .add is atomic. You initialize it like this:

cache = Cache(app, config={'CACHE_TYPE': 'simple'})

The simple backend is scoped to a single process, is not thread-safe, and has an .add operation that is not atomic. (This does not involve Mongo at all, by the way. If you wanted your cache to be backed by Mongo, you'd have to specify a backend specifically made to send data to a Mongo database.)

So you have to switch to another backend, one that guarantees that .add is atomic. You could follow the lead of the Celery example and use the memcached backend, which does have an atomic .add operation. I don't use Flask, but I've done essentially what you are doing with Django and Celery, and used the Redis backend successfully to provide the kind of locking you're using here.
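
As a concrete illustration, a minimal sketch of pointing Flask-Caching at Redis rather than the simple backend could look like the following; the Redis URL is an assumption, so adjust host, port and database number to your setup.

from flask import Flask
from flask_caching import Cache

app = Flask(__name__)

# Backing the cache with Redis means cache.add() maps to an atomic
# server-side operation instead of a per-process Python dictionary.
cache = Cache(app, config={
    'CACHE_TYPE': 'redis',                          # same backend the final working code above uses
    'CACHE_REDIS_URL': 'redis://localhost:6379/0',  # assumed local Redis instance
})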

With this setup, you should still expect to see workers receiving the task, since the lock is checked inside the task itself. The only difference is that the work won't be performed if the lock is acquired by another worker.
In the example given in the docs, this is the desired behavior; if a lock already exists, the task simply does nothing and finishes as successful. What you want is slightly different; you want the work to be queued up instead of ignored.

In order to get the desired effect, you would need to make sure that the task will be picked up by a worker and performed some time in the future. One way to accomplish this is with retrying.

@task(bind=True, name='my-task')
def my_task(self):
    lock_id = self.name

    with memcache_lock(lock_id, self.app.oid) as acquired:
        if acquired:
            # do work if we got the lock
            print('acquired is {}'.format(acquired))
            return 'result'

    # otherwise, the lock was already in use
    raise self.retry(countdown=60)  # redeliver message to the queue, so the work can be done later
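
One caveat worth noting: self.retry() gives up after Task.max_retries attempts (three by default) and then marks the task as failed, so if the lock can stay held longer than a few retry intervals you may want to raise or remove that limit. Below is a sketch under the assumption that the memcache_lock context manager shown earlier is in scope; max_retries=None is an illustrative choice meaning "retry indefinitely".

@task(bind=True, name='my-task', max_retries=None)
def my_task(self):
    lock_id = self.name

    with memcache_lock(lock_id, self.app.oid) as acquired:
        if acquired:
            # do work if we got the lock
            return 'result'

    # re-queue and try again in a minute; with max_retries=None this never gives up
    raise self.retry(countdown=60)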

I also found this to be a surprisingly hard problem. Inspired mainly by Sebastian's work on implementing a distributed locking algorithm in Redis, I wrote up a decorator function.

A key point to bear in mind about this approach is that we lock tasks at the level of the task's argument space; e.g. we allow multiple game update / process order tasks to run concurrently, but only one per game. That's what argument_signature achieves in the code below. You can see documentation on how we use this in our stack at this gist:

import base64
from contextlib import contextmanager
import json
import pickle as pkl
import uuid

from backend.config import Config
from redis import StrictRedis
from redis_cache import RedisCache
from redlock import Redlock

rds = StrictRedis(Config.REDIS_HOST, decode_responses=True, charset="utf-8")
rds_cache = StrictRedis(Config.REDIS_HOST, decode_responses=False, charset="utf-8")
redis_cache = RedisCache(redis_client=rds_cache, prefix="rc", serializer=pkl.dumps, deserializer=pkl.loads)
dlm = Redlock([{"host": Config.REDIS_HOST}])

TASK_LOCK_MSG = "Task execution skipped -- another task already has the lock"
DEFAULT_ASSET_EXPIRATION = 8 * 24 * 60 * 60  # by default keep cached values around for 8 days
DEFAULT_CACHE_EXPIRATION = 1 * 24 * 60 * 60  # we can keep cached values around for a shorter period of time

REMOVE_ONLY_IF_OWNER_SCRIPT = """
if redis.call("get",KEYS[1]) == ARGV[1] then
    return redis.call("del",KEYS[1])
else
    return 0
end
"""


@contextmanager
def redis_lock(lock_name, expires=60):
    # https://breadcrumbscollector.tech/what-is-celery-beat-and-how-to-use-it-part-2-patterns-and-caveats/
    random_value = str(uuid.uuid4())
    lock_acquired = bool(
        rds.set(lock_name, random_value, ex=expires, nx=True)
    )
    yield lock_acquired
    if lock_acquired:
        rds.eval(REMOVE_ONLY_IF_OWNER_SCRIPT, 1, lock_name, random_value)


def argument_signature(*args, **kwargs):
    arg_list = [str(x) for x in args]
    kwarg_list = [f"{str(k)}:{str(v)}" for k, v in kwargs.items()]
    return base64.b64encode(f"{'_'.join(arg_list)}-{'_'.join(kwarg_list)}".encode()).decode()


def task_lock(func=None, main_key="", timeout=None):
    def _dec(run_func):
        def _caller(*args, **kwargs):
            with redis_lock(f"{main_key}_{argument_signature(*args, **kwargs)}", timeout) as acquired:
                if not acquired:
                    return TASK_LOCK_MSG
                return run_func(*args, **kwargs)
        return _caller
    return _dec(func) if func is not None else _dec

Implementation in our task definitions file:

@celery.task(name="async_test_task_lock")
@task_lock(main_key="async_test_task_lock", timeout=UPDATE_GAME_DATA_TIMEOUT)
def async_test_task_lock(game_id):
    print(f"processing game_id {game_id}")
    time.sleep(TASK_LOCK_TEST_SLEEP)

How we test against a local celery cluster:

from backend.tasks.definitions import async_test_task_lock, TASK_LOCK_TEST_SLEEP
from backend.tasks.redis_handlers import rds, TASK_LOCK_MSG
class TestTaskLocking(TestCase):
    def test_task_locking(self):
        rds.flushall()
        res1 = async_test_task_lock.delay(3)
        res2 = async_test_task_lock.delay(5)
        self.assertFalse(res1.ready())
        self.assertFalse(res2.ready())
        res3 = async_test_task_lock.delay(5)
        res4 = async_test_task_lock.delay(5)
        self.assertEqual(res3.get(), TASK_LOCK_MSG)
        self.assertEqual(res4.get(), TASK_LOCK_MSG)
        time.sleep(TASK_LOCK_TEST_SLEEP)
        res5 = async_test_task_lock.delay(3)
        self.assertFalse(res5.ready())

(As a goodie, there's also a quick example of how to set up a redis_cache.)
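
For what that usage might look like, here is a minimal, hedged sketch built on the redis_cache object constructed above; the import path, function name and TTL choice are assumptions for illustration, and it relies on the cache(ttl=...) decorator provided by the python-redis-cache package used in the setup code.

from backend.tasks.redis_handlers import redis_cache, DEFAULT_CACHE_EXPIRATION

@redis_cache.cache(ttl=DEFAULT_CACHE_EXPIRATION)
def expensive_lookup(game_id):
    # The first call computes and stores the result in Redis; subsequent calls
    # with the same argument return the cached value until the TTL expires.
    return {"game_id": game_id, "players": []}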
