
Size limit on Celery task arguments?

We have a Celery task that requires a Pandas dataframe as an input. The dataframe is first serialized to JSON and then passed to the task as an argument. The dataframes can have around 35 thousand entries, which results in a JSON payload of about 700 kB. We are using Redis as the broker.

Unfortunately the call to delay() on this task often takes too long (in excess of thirty seconds), and our web requests time out.

Is this the kind of scale that Redis and Celery should be able to handle? I presumed it was well within limits and the problem lies elsewhere, but I can't find any guidance or experience on the internet.
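For reference, the pattern described above looks roughly like the sketch below; the task name, broker URL, and processing step are placeholders, not the asker's actual code. The point is that the entire serialized dataframe travels through Redis on every delay() call.

import io

import pandas as pd
from celery import Celery

app = Celery("tasks", broker="redis://localhost:6379/0")  # assumed broker URL

@app.task
def process_dataframe(df_json):
    # The worker rebuilds the dataframe from the JSON payload.
    df = pd.read_json(io.StringIO(df_json))
    ...  # actual processing

# Caller (web request) side: the whole ~700 kB JSON string is serialized
# and pushed to Redis before delay() returns.
# process_dataframe.delay(df.to_json())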

I would suggest saving the JSON in your database and passing the id to the Celery task instead of the whole payload:

from django.db import models

class TodoTasks(models.Model):
    serialized_json = models.TextField()
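On the request side you then serialize the dataframe once, store it, and only send the primary key through the broker. A minimal sketch, assuming a task named process_todo (defined further below):

obj = TodoTasks.objects.create(serialized_json=df.to_json())
process_todo.delay(obj.id)  # only a small integer goes through Redis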

Moreover, you can keep a record of the task's status with a few extra fields, and even store the error traceback (which I find very useful for debugging):

import traceback
from django.db import models

class TodoTasks(models.Model):
    class StatusChoices(models.TextChoices):
        PENDING = "PENDING", "Awaiting celery to process the task"
        SUCCESS = "SUCCESS", "Task done with success"
        FAILED = "FAILED", "Task failed to be processed"

    serialized_json = models.TextField()

    status = models.CharField(
        max_length=10, choices=StatusChoices.choices, default=StatusChoices.PENDING
    )
    created_date = models.DateTimeField(auto_now_add=True)
    processed_date = models.DateTimeField(null=True, blank=True)
    error = models.TextField(null=True, blank=True)

    def handle_exception(self):
        # Store the full traceback and mark the row as failed so the error
        # can be inspected later (the caller is expected to save()).
        self.error = traceback.format_exc()
        self.status = self.StatusChoices.FAILED
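The worker then fetches the row by its id, rebuilds the dataframe, and records the outcome on the same model. A sketch under the same assumptions (task name, app layout, and the processing step are placeholders):

import io

import pandas as pd
from celery import shared_task
from django.utils import timezone

from .models import TodoTasks  # assumed app layout


@shared_task
def process_todo(task_id):
    todo = TodoTasks.objects.get(pk=task_id)
    try:
        # Rebuild the dataframe from the stored JSON.
        df = pd.read_json(io.StringIO(todo.serialized_json))
        ...  # actual processing goes here
        todo.status = TodoTasks.StatusChoices.SUCCESS
    except Exception:
        todo.handle_exception()  # store the traceback and mark the row as FAILED
    finally:
        todo.processed_date = timezone.now()
        todo.save()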
