
Celery + Python: Queue time consuming tasks within another task

I want to query an API (which is time consuming) with lots of items (~100), but not all at once. Instead, I want a short delay between the queries.

What I currently have is a task that is executed asynchronously, iterates over the queries, and waits some time before each one:

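import time

from celery import shared_task


@shared_task
def query_api_multiple(values):
    # Pause (in seconds) before each API call
    delay_between_queries = 1

    query_results = []

    for value in values:
        time.sleep(delay_between_queries)

        # query_api is the slow API call, defined elsewhere
        response = query_api(value)
        # Keep only responses that carry a result
        if response['result']:
            query_results.append(response)

    return query_results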

My question is: when several of those requests come in, will the second request be executed after the first has finished, or while the first is still running? And if they are not executed at the same time, how can I achieve that?

You should not use time.sleep but rate limit your task instead:

Task.rate_limit

Set the rate limit for this task type (limits the number of tasks that can be run in a given time frame).

The rate limits can be specified in seconds, minutes or hours by appending “/s”, “/m” or “/h” to the value. Tasks will be evenly distributed over the specified time frame.

Example: “100/m” (hundred tasks a minute). This will enforce a minimum delay of 600ms between starting two tasks on the same worker instance.
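The 600ms figure is simply the length of the time frame divided by the number of tasks. A minimal sketch of that arithmetic, assuming a hypothetical helper named min_delay_seconds (it is not part of Celery) and treating a bare number as "per second":

```python
def min_delay_seconds(rate: str) -> float:
    """Return the minimum spacing in seconds implied by a rate such as '100/m'.

    Hypothetical helper for illustration only; not a Celery function.
    """
    seconds_per_unit = {"s": 1, "m": 60, "h": 3600}
    count, _, unit = rate.partition("/")
    # Assumption: a bare number such as "10" means "per second".
    return seconds_per_unit[unit or "s"] / float(count)
```

For example, min_delay_seconds("100/m") divides 60 seconds by 100 tasks, which matches the 600ms quoted above.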

So if you want to limit it to 1 query per second, try this:

@shared_task(rate_limit='1/s')
def query_api_multiple(values):
    ...

Yes, if you create multiple tasks then they may run at the same time.

You can rate limit on a per-task-type basis with Celery if you want to limit the number of tasks that run per period of time. Alternatively, if you need more flexibility than Celery provides out of the box, you could implement a rate limiting pattern using something like Redis, combined with Celery retries.
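To illustrate what such a pattern might look like, here is a minimal fixed-window counter. It is only a sketch: the class FixedWindowLimiter is hypothetical, and it keeps its counter in a plain in-memory dict as a stand-in for Redis, so it would not coordinate separate worker processes the way a shared store would.

```python
class FixedWindowLimiter:
    """In-memory stand-in for a shared counter store such as Redis.

    Hypothetical helper for illustration; state is local to one process.
    """

    def __init__(self, max_calls, window_seconds):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self.counts = {}  # window index -> calls seen in that window

    def try_acquire(self, now):
        """Return True if a call is allowed at time `now` (in seconds)."""
        window = int(now // self.window_seconds)
        used = self.counts.get(window, 0)
        if used >= self.max_calls:
            return False  # the caller would back off and try again later
        self.counts = {window: used + 1}  # drop counters from older windows
        return True
```

Inside a Celery task declared with bind=True, a False result could be handled by calling self.retry(countdown=...) so that the task is rescheduled rather than blocking a worker with a sleep.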
