Celery: shutdown worker and retry the task

I need to implement the following logic in a Celery task: if some condition is met, shut down the current worker and retry the task.

Tested on the sample task:
    @app.task(bind=True, max_retries=1)
    def shutdown_and_retry(self, config):
        try:
            raise Exception('test exception')
        except Exception as exc:
            print('Retry {}/{}, task id {}'.format(self.request.retries, self.max_retries, self.request.id))
            app.control.shutdown(destination=[self.request.hostname])  # send shutdown signal to the current worker
            raise self.retry(exc=exc, countdown=5)
        print('Execute task id={} retries={}'.format(self.request.id, self.request.retries))
        return 'some result'
But it gives strange results. Steps:
celery worker -Q test_queue -A test_worker -E -c 1 -n test_worker_1
What I have tried:

1. Set task_reject_on_worker_lost = True in celeryconfig.py and ran the same task. Result: nothing changed.
2. Called app.control.revoke(self.request.id) before the shutdown and retry calls in the worker (based on this). Result: after the first try I got the same (2 tasks in queue), but when I ran a second worker the queue was flushed and it didn't run anything.

Is there a way to not push the original task back to the queue during the app.control.shutdown() call? It seems that this is the root cause. Or could you please suggest another workaround that would allow implementing the logic described above?
Setup: RabbitMQ 3.8.2, celery 4.1.0, python 3.5.4

Settings in celeryconfig.py:
task_acks_late = True
task_acks_on_failure_or_timeout = True
task_reject_on_worker_lost = False
task_track_started = True
worker_prefetch_multiplier = 1
worker_disable_rate_limits = True
It looks like the issue is task_acks_late in your configuration file. By using that, you are saying "only remove the task from the queue when I have finished running". You then kill the worker, so the task is never acknowledged (and you get duplicates of the task).