External API rate limiting with RabbitMQ and Celery

Question

I am using an external REST API that limits my API requests to 1 CPS (one call per second).

The architecture is as follows:

Versions:

Flask
RabbitMQ 3.6.4
AMQP 1.4.9
Kombu 3.0.35
Celery 3.1.23
Python 2.7

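API clients send web requests to the internal API, which processes each request and controls the rate at which tasks are sent to RabbitMQ. These tasks can take anywhere from 5 to 120 seconds, and in some cases tasks queue up and are sent to the external API at a higher rate than the defined limit, which causes many failed requests (about 5% of requests fail).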
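A minimal sketch of this effect, not taken from the original setup: tasks released at exactly one per second can still reach the external API within the same second when their run times differ. It assumes, purely for illustration, that each task makes its single external call at the moment it finishes; the function name `external_calls_per_second` and the sample durations are hypothetical.

```python
from collections import Counter


def external_calls_per_second(start_times, durations):
    """Count external API calls per one-second bucket.

    Assumption (illustrative only): each task calls the external API
    once, at the moment it finishes.
    """
    call_times = [start + duration
                  for start, duration in zip(start_times, durations)]
    return Counter(int(t) for t in call_times)


# Tasks are released one per second, so the publish rate is exactly 1 CPS...
start_times = [0, 1, 2, 3]
# ...but each task runs for a different time before calling the API.
durations = [5, 4, 3, 2]

calls = external_calls_per_second(start_times, durations)
worst_second, worst_count = calls.most_common(1)[0]
print(worst_second, worst_count)  # 5 4
```

All four calls land in second 5, so limiting the publish rate alone does not bound the rate seen by the external API.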

Possible solutions:

Increase the external API limit
Add more workers
Track failed tasks and retry them later

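Although these solutions may work, they do not really address the implementation of my rate limiter or control the actual rate at which my workers process API requests. Down the line, I really need to control the external rate.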

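I believe that controlling the rate at which RabbitMQ delivers messages to the workers might be a better option. I found the RabbitMQ prefetch option, but I am not sure about it. Can anyone recommend other options for controlling the rate at which messages are sent to consumers?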
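For reference, a minimal sketch of what the prefetch option looks like on the Celery side, assuming the Celery 3.1 setting names `CELERYD_PREFETCH_MULTIPLIER` and `CELERY_ACKS_LATE` and a hypothetical `celeryconfig.py` module. Prefetch caps how many unacknowledged messages a worker holds at once; it does not by itself cap messages per second.

```python
# celeryconfig.py (Celery 3.1 setting names; a sketch, not a tuned configuration)

# Number of messages to prefetch per worker process. The effective
# prefetch count is this value multiplied by the worker concurrency.
CELERYD_PREFETCH_MULTIPLIER = 1

# Acknowledge a message only after its task finishes, so a reserved
# message stays unacknowledged while the task is running.
CELERY_ACKS_LATE = True
```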

Answer 1

You will need to create your own rate limiter, because Celery's rate limit only works per worker and "does not work as you expect it to".
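To illustrate the per-worker point: because each worker enforces the limit independently, the combined rate grows with the number of workers. The sketch below is plain arithmetic, not Celery code; `aggregate_rate` and `per_worker_limit` are hypothetical helper names.

```python
def aggregate_rate(per_worker_per_minute, workers):
    """Worst-case combined rate when every worker enforces the limit on its own."""
    return per_worker_per_minute * workers


def per_worker_limit(global_per_minute, workers):
    """Largest whole per-minute limit per worker whose sum stays within the global cap."""
    return global_per_minute // workers


# A 1 CPS external limit is 60 calls per minute.
GLOBAL_PER_MINUTE = 60

# Four workers each limited to 60/m can together reach 240/m.
print(aggregate_rate(60, workers=4))  # 240

# Splitting the budget statically gives each of four workers 15/m.
limit = per_worker_limit(GLOBAL_PER_MINUTE, workers=4)
print("%d/m" % limit)  # 15/m
```

A static split only holds while the worker count is fixed, and several workers can still fire in the same second, which is one reason a shared counter is the more robust route.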

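I personally found that it breaks completely when trying to add new tasks from within another task.

I think the range of rate-limiting requirements is too broad and too application-specific, so Celery's implementation is intentionally kept simple.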

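Here is an example I created using Celery + Django + Redis. Basically, it adds a convenience method to your App.Task class that keeps track of your task execution rate in Redis. If the rate is too high, the task will Retry at a later time.

This example sends an SMTP message, but that can easily be replaced with an API call.

The algorithm is inspired by Figma: https://www.figma.com/blog/an-alternative-approach-to-rate-limiting/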

https://gist.github.com/Vigrond/2bbea9be6413415e5479998e79a1b11a

# Rate limiting with Celery + Django + Redis
# Multiple Fixed Windows Algorithm inspired by Figma https://www.figma.com/blog/an-alternative-approach-to-rate-limiting/
#   and Celery's sometimes ambiguous, vague, and one-paragraph documentation
#
# Celery's Task is subclassed and the is_rate_okay function is added


# celery.py or however your App is implemented in Django
import os
import math
import time

from celery import Celery, Task
from django_redis import get_redis_connection
from django.conf import settings
from django.utils import timezone


app = Celery('your_app')

# Get Redis connection from our Django 'default' cache setting
redis_conn = get_redis_connection("default")

# We subclass the Celery Task
class YourAppTask(Task):
  def is_rate_okay(self, times=30, per=60):
    """
      Checks to see if this task is hitting our defined rate limit too much.
      This example sets a rate limit of 30/minute.

      times (int): The "30" in "30 times per 60 seconds".
      per (int):  The "60" in "30 times per 60 seconds".

      The Redis structure we create is a Hash of timestamp keys with counter values
      {
        '1560649027.515933': '2',  // unlikely to have more than 1
        '1560649352.462433': '1',
      }

      The Redis key is expired after the amount of 'per' has elapsed.
      The algorithm totals the counters and checks against 'times'.

      This algorithm currently does not implement the "leniency" described 
      at the bottom of the figma article referenced at the top of this code.
      This is left up to you and depends on application.

      Returns True if under the limit, otherwise False.
    """

    # Get a timestamp accurate to the microsecond
    timestamp = timezone.now().timestamp()

    # Set our Redis key to our task name
    key = f"rate:{self.name}"

    # Create a pipeline to execute redis code atomically
    pipe = redis_conn.pipeline()

    # Increment our current task hit in the Redis hash
    pipe.hincrby(key, timestamp)

    # Grab the current expiration of our task key
    pipe.ttl(key)

    # Grab all of our task hits in our current frame (of 60 seconds)
    pipe.hvals(key)

    # This returns a list of our command results.  [current task hits, expiration, list of all task hits,]
    result = pipe.execute()

    # If our expiration is not set, set it.  This is not part of the atomicity of the pipeline above.
    if result[1] < 0:
      redis_conn.expire(key, per)

    # We must convert byte to int before adding up the counters and comparing to our limit
    return sum(int(count) for count in result[2]) <= times


app.Task = YourAppTask
app.config_from_object('django.conf:settings', namespace='CELERY')
app.autodiscover_tasks()

...

# SMTP Example
import random
from YourApp.celery import app
from django.core.mail import EmailMessage

# We set infinite max_retries so backlogged email tasks do not disappear
@app.task(name='smtp.send-email', max_retries=None, bind=True)
def send_email(self, to_address):

    if not self.is_rate_okay():
        # We implement a random countdown between 30 and 60 seconds 
        #   so tasks don't come flooding back at the same time
        raise self.retry(countdown=random.randint(30, 60))

    message = EmailMessage(
        'Hello',
        'Body goes here',
        'from@yourdomain.com',
        [to_address],
    )
    message.send()
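The counting logic of is_rate_okay above can be tried without Redis or Celery. The sketch below mirrors it with an in-memory dictionary; `InMemoryRateCounter` is a hypothetical stand-in for experimentation only, and the current time is passed in explicitly so the behaviour is easy to follow. It is not shared between processes, so it cannot replace Redis in a multi-worker deployment.

```python
class InMemoryRateCounter:
    """In-memory stand-in for the Redis hash used by is_rate_okay (illustrative only)."""

    def __init__(self, times=30, per=60):
        self.times = times        # allowed hits per window
        self.per = per            # window length in seconds
        self.hits = {}            # timestamp -> counter, like the Redis hash
        self.expires_at = None    # mirrors the TTL on the Redis key

    def is_rate_okay(self, now):
        # The whole hash disappears once the key's TTL has elapsed.
        if self.expires_at is not None and now >= self.expires_at:
            self.hits = {}
            self.expires_at = None

        # Record this hit (HINCRBY); rejected attempts are counted too.
        self.hits[now] = self.hits.get(now, 0) + 1

        # Start the TTL when the key is first created (EXPIRE).
        if self.expires_at is None:
            self.expires_at = now + self.per

        # Total the counters and compare against the limit (HVALS + sum).
        return sum(self.hits.values()) <= self.times


limiter = InMemoryRateCounter(times=2, per=10)
results = [limiter.is_rate_okay(now=t) for t in (0.0, 1.0, 2.0, 3.0, 10.0)]
print(results)  # [True, True, False, False, True]
```

With a limit of 2 per 10 seconds, the third and fourth attempts are rejected, and attempts succeed again once the key has expired. For the 1 CPS limit in the question, the corresponding call in the Redis version would presumably be `self.is_rate_okay(times=1, per=1)`.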


Answer 1
3, accepted, 2019-06-16 02:10:02
