
Task queue in Celery

I have a service that processes data. It is written in Python (Django) and uses Celery to run the processing asynchronously.

Processing data consumes credits. Users can also buy credits; a purchase is triggered by a Stripe webhook.

Each action that changes a user's credits is recorded as a "job". I have two Celery tasks, and each of them adds a job row to a JobId table.

I use the "job" concept to keep track of which data was processed in which job.

models.py:

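from django.contrib.auth.models import User  # assumes the default auth User model
from django.db import models


class JobId(models.Model):
    # on_delete is required since Django 2.0; CASCADE matches the earlier default.
    user = models.ForeignKey(User, blank=True, null=True, default=None, on_delete=models.CASCADE)
    job_time = models.DateTimeField(auto_now_add=True)
    # Current credit level after this job.
    credits = models.IntegerField(default=0, null=True, blank=True)
    # Credit impact (delta) of this job.
    credit_delta = models.IntegerField(default=0, null=True, blank=True)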

tasks.py:

from celery import shared_task


@shared_task
def task_1_buy_credits(user):
    # Look up the user's credit level in the JobId table (the user's last job).
    # Add one row to the JobId table with the user's new credit balance.
    ...


@shared_task
def task_2_use_credits(user, data):
    # Process an amount of data that is not known up front.
    # Look up the user's credit level in the JobId table (the user's last job).
    # Decide whether to process the job completely or stop if credits are insufficient.
    ...

My current issue arises when a user starts multiple jobs at once: a new job starts before the previous one has finished. Because the final credit balance of the running job is not known yet, I set the balance to zero to prevent new jobs from starting for now, even though there might be enough credits left to run them.

A similar situation occurs when credits are purchased while a job is still being processed.

Basically, I need a solution that runs a user's tasks only in the order they were created, each one starting after the previous one has finished.

OR

I need a real-time, per-user credit level check that works across running tasks that have not finished yet.

I cannot run this synchronously in my Django environment, because this is a web application hosted on Heroku and requests time out after 30 seconds.

This is a difficult problem because Celery tasks are designed to be independent of one another. Each task works only with the arguments you pass it and does not care about the order in which jobs are processed. There are a few ways around this, such as groups and chords, but I don't see how they would fit your needs.

First, I would add a task_id CharField to the JobId model. When you start a task, store the returned task ID in the database row for that JobId. Then, for a given user, you can check the status of that user's jobs and, if any are still pending, return the most recent credit state.
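A minimal sketch of that lookup, in plain Python so it runs without Django or Celery. JobRecord, credit_state and get_state are hypothetical names used only for illustration; JobRecord stands in for a JobId row with the extra task_id field. In a real project, get_state would probably wrap celery.result.AsyncResult(task_id).state, and the task ID would come from the result object returned when the task is started. Falling back to 0 when no job has finished yet is an assumption taken from the model's default.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Celery's terminal task states; any other state is treated as still pending.
FINISHED_STATES = {"SUCCESS", "FAILURE", "REVOKED"}


@dataclass
class JobRecord:
    """Hypothetical stand-in for a JobId row, extended with task_id."""
    user_id: int
    credits: int       # credit level recorded by this job
    credit_delta: int  # credit impact of this job
    task_id: str       # ID returned when the Celery task was started


def credit_state(
    jobs: List[JobRecord],
    user_id: int,
    get_state: Callable[[str], Optional[str]],
) -> Tuple[int, bool]:
    """Return (credits, has_pending) for one user.

    `jobs` is assumed to be ordered oldest to newest. The credit level is
    taken from the newest job whose task has finished; has_pending says
    whether any of the user's tasks are still running or queued.
    """
    user_jobs = [job for job in jobs if job.user_id == user_id]
    has_pending = any(
        get_state(job.task_id) not in FINISHED_STATES for job in user_jobs
    )
    for job in reversed(user_jobs):
        if get_state(job.task_id) in FINISHED_STATES:
            return job.credits, has_pending
    # Assumption: no finished job yet means the model's default of 0 credits.
    return 0, has_pending
```

With this shape, task_2_use_credits could check has_pending and decide whether to wait, retry later, or proceed with the last settled balance, instead of writing a temporary zero balance.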
