
Setup for a job queue that enables workers to be killed and requeued

I am looking for a way to set up a job queueing system in Python, like RQ or Celery. I am using Elastic Beanstalk for deployment.

I am currently using RQ, but I am facing the following problem: if the worker gets killed, the job is lost. I would like it to be automatically requeued. I have long-running jobs (some can last for 1 hour). Sometimes we need to reboot the server, or deploy a new version, without waiting for all jobs to finish. Sometimes an essential container fails and causes all the other containers to be rebooted.

During a reboot, Beanstalk automatically sends the SIGINT signal to stop the container, then sends the SIGKILL signal 10 seconds later. I cannot find a way to have the jobs requeued in such an event. I tried both RQ with Redis and Celery with a Redis broker.

Can someone recommend a setup for this?

One thing you can do with Celery is to enable the task_reject_on_worker_lost configuration setting.

Even if task_acks_late is enabled, the worker will acknowledge tasks when the worker process executing them abruptly exits or is signaled (e.g., KILL/INT, etc.).

Setting this to true allows the message to be re-queued instead, so that the task is executed again, either by the same worker or by another worker.
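
A minimal sketch of what that configuration could look like, assuming a local Redis broker URL and a hypothetical long_running_job task (both are placeholders, not from the question):

    from celery import Celery

    # Placeholder broker URL for illustration; point this at your Redis instance.
    app = Celery("jobs", broker="redis://localhost:6379/0")

    app.conf.update(
        task_acks_late=True,              # acknowledge only after the task has finished
        task_reject_on_worker_lost=True,  # re-queue the message if the worker process dies (e.g. SIGKILL)
    )

    @app.task
    def long_running_job(job_id):
        # Placeholder body for a long-running job (can last up to 1 hour).
        ...

With task_acks_late enabled, the message is only acknowledged once the task completes, so a job interrupted by SIGKILL is redelivered by the broker and picked up again. Note that this means a task may run more than once, so it should be written to be idempotent.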
