
Serial processing of specific tasks using Celery with concurrency

I have a Python/Celery setup: there is a queue named "task_queue" and multiple Python scripts that feed it data from different sensors. A Celery worker reads from that queue and sends an alarm to the user if a sensor value changed from high to low. The worker has multiple threads (I have the autoscaling parameter enabled) and everything works fine until one sensor sends multiple messages at once. That's when I get a race condition and may send multiple alarms to the user: before one thread stores the information that it has already sent an alarm, a few other threads send the alarm as well.

I have n sensors (n can be more than 10,000) and messages from any one sensor should be processed sequentially. So in theory I could have n threads, but that would be overkill. I'm looking for the simplest way to distribute the messages evenly across x threads (usually 10 or 20), so that I wouldn't have to (re)write the routing function and define new queues each time I want to increase (or decrease) x.

So is it possible to somehow mark tasks that originate from the same sensor so they are executed serially (when calling delay or apply_async)? Or is there a different queue/worker architecture I should be using to achieve that?
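For reference, the kind of routing function the question hopes to avoid maintaining by hand is often just a hash of the sensor id. A minimal sketch, assuming x queues named task_queue_0 … task_queue_9 (each consumed by its own --concurrency=1 worker so per-sensor ordering is preserved) and a hypothetical process_reading task:

import zlib

NUM_QUEUES = 10  # "x" in the question: one single-process worker per queue

def queue_for_sensor(sensor_id: str) -> str:
    # Deterministically map a sensor to one of NUM_QUEUES queues, so all
    # messages from the same sensor land in the same queue and are
    # processed serially by that queue's worker.
    bucket = zlib.crc32(sensor_id.encode("utf-8")) % NUM_QUEUES
    return f"task_queue_{bucket}"

# Enqueueing a reading (process_reading is a hypothetical task name):
# process_reading.apply_async(args=(sensor_id, value),
#                             queue=queue_for_sensor(sensor_id))

With this scheme, changing x only means changing NUM_QUEUES and the number of workers started, not rewriting the routing logic.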

From what I understand, you have some tasks that can all run at the same time and a specific kind of task that cannot (it needs to be executed one at a time).

There is no way (for now) to set the concurrency of a specific task queue, so I think the best approach in your situation would be to handle the problem with multiple workers.

Let's say you have the following queues:

  • queue_1: here we send tasks that can all run at the same time.
  • queue_2: here we send tasks that must run one at a time.

You could start Celery with the following commands (if you want both workers on the same machine):

celery -A proj worker --loglevel=INFO --concurrency=10 -n worker1@%h -Q queue_1
celery -A proj worker --loglevel=INFO --concurrency=1 -n worker2@%h -Q queue_2

This makes worker1, which has concurrency 10, handle all tasks that can run at the same time, while worker2 handles only the tasks that must run one at a time.

Here is a documentation reference: https://docs.celeryproject.org/en/stable/userguide/workers.html

NOTE: you will need to specify which queue each task runs in. This can be done when calling apply_async, directly from the task decorator, or in a few other ways (e.g. the task_routes setting).
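A minimal sketch of those options, where the app name (proj), broker URL, and the send_alarm task are illustrative placeholders:

from celery import Celery

app = Celery("proj", broker="redis://localhost:6379/0")  # broker URL is illustrative

# Option 1: pin the task to a queue directly in the decorator.
@app.task(queue="queue_2")
def send_alarm(sensor_id, value):
    ...

# Option 2: choose the queue at call time with apply_async.
# send_alarm.apply_async(args=(42, "low"), queue="queue_2")

# Option 3: route by task name in the configuration
# (the key here is an illustrative task name; patterns are also supported).
app.conf.task_routes = {"proj.send_alarm": {"queue": "queue_2"}}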

