
Celery not executing new tasks after a lost Redis connection is re-established

I have a Celery worker configured to connect to Redis as follows:

from celery import Celery
from django.conf import settings

celery_app_site24x7 = Celery('monitoringExterne.celerys.site24x7',
                             broker=settings.REDIS['broker'],
                             backend=settings.REDIS['backend'])

celery_app_site24x7.conf.broker_transport_options = {
    'visibility_timeout': 36000  # seconds before an unacknowledged task is redelivered
}

celery_app_site24x7.conf.socket_timeout = 300
celery_app_site24x7.conf.broker_connection_max_retries = None  # retry the broker connection forever
celery_app_site24x7.config_from_object('django.conf:settings')
celery_app_site24x7.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
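For reference, the task invoked later from the Django shell (celery_tasks.site24x7.test.delay()) presumably lives in a module picked up by autodiscover_tasks(). A minimal sketch, assuming the module path and task name implied by that call (neither is shown in the question):

# celery_tasks/site24x7.py  -- hypothetical path inferred from the shell call below
from celery import shared_task

@shared_task
def test():
    # Trivial body for illustration; the real task body is not shown in the question.
    return 'ok'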

The issue is that if Redis goes down and the connection is then re-established, new tasks added to the queue are not executed:

[2020-01-13 10:10:14,517: ERROR/MainProcess] consumer: Cannot connect to redis://xxx.xxx.xxx.xx:6380/10: Error while reading from socket: ('Connection closed by server.',).
Trying again in 2.00 seconds...

[2020-01-13 10:10:16,590: INFO/MainProcess] Connected to redis://xxx.xxx.xxx.xx:6380/10
[2020-01-13 10:10:16,699: INFO/MainProcess] mingle: searching for neighbors
[2020-01-13 10:10:17,766: INFO/MainProcess] mingle: all alone

I have manually called a Celery task through the Django shell as follows:

celery_tasks.site24x7.test.delay()

It returns the AsyncResult with the task ID, but the worker does not process the task.

<AsyncResult:ff634b85-edb5-44d4-bdb1-17a220761fcc>
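One way to confirm the task is never picked up is to poll the returned AsyncResult from the same shell; a task sitting in an unconsumed queue stays in the PENDING state (a minimal sketch using the same call as above):

result = celery_tasks.site24x7.test.delay()
print(result.id)     # e.g. ff634b85-edb5-44d4-bdb1-17a220761fcc
print(result.state)  # remains 'PENDING' while no worker consumes the queue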

If I keep launching the task with delay(), the queue length keeps growing:

127.0.0.1:6379[10]> llen site24x7
(integer) 4
127.0.0.1:6379[10]> llen site24x7
(integer) 5
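To double-check that these list entries really are Celery task messages (and not something else writing to the same key), the head of the list can be inspected directly; LRANGE is a standard Redis command, and each element should be a serialized message whose headers name the task:

127.0.0.1:6379[10]> lrange site24x7 0 0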

Below is the output of the celery status and inspect commands:

$ celery -A monitoringExterne --app=monitoringExterne.celerys.site24x7 status

Error: No nodes replied within time constraint.

$ celery -A monitoringExterne --app=monitoringExterne.celerys.site24x7 inspect active

Error: No nodes replied within time constraint.

If your workers are not subscribed to the site24x7 queue, then the number of tasks in that queue will keep increasing... Try running the worker with something like: celery -A monitoringExterne.celerys.site24x7 worker -Q site24x7 -l info
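For context, a worker only consumes the queues it is told to consume. A minimal sketch of a route that would send these tasks to the site24x7 queue, assuming the task name implied by the question (the route itself is not shown there):

# Hypothetical routing; would sit next to the other conf settings shown above.
celery_app_site24x7.conf.task_routes = {
    'celery_tasks.site24x7.*': {'queue': 'site24x7'},
}

With a route like that in place, only a worker started with -Q site24x7 (as in the command above) will pick those tasks up.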

Also, keep in mind that -A and --app are the same flag, so you should not use both.

If you are getting the No nodes replied within time constraint output, that means there are no Celery workers active in your cluster, which could also be the reason the number of tasks in that queue keeps increasing: there are no workers to execute them!
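A quick way to verify whether any worker is reachable at all is the ping sub-command (assuming the same -A value as above):

celery -A monitoringExterne.celerys.site24x7 inspect ping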
