How do you run a Celery worker with a Django app that scales on AWS Elastic Beanstalk?

Question

How can I use Django with AWS Elastic Beanstalk so that it also runs Celery tasks, but only on the main node?

Answer 1

This is how I set up Celery with Django on Elastic Beanstalk, with scalability working fine.

Please keep in mind that the 'leader_only' option for container_commands takes effect only on environment rebuild or deployment of the app. If the service runs long enough, the leader node may be removed by Elastic Beanstalk. To deal with that, you may have to apply instance protection to your leader node. See: http://docs.aws.amazon.com/autoscaling/latest/userguide/as-instance-termination.html#instance-protection-instance
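As a rough illustration of the instance protection mentioned above, here is a minimal Python sketch. It assumes you use boto3 and already know the Auto Scaling group name and the leader's instance ID; the function name and the placeholder values in the comments are hypothetical, not part of the original answer. Note that this kind of protection covers scale-in events only, not manual termination or health-check replacement.

```python
def protect_from_scale_in(autoscaling_client, group_name, instance_id):
    """Ask Auto Scaling not to pick this instance when the group scales in.

    `autoscaling_client` is expected to be a boto3 Auto Scaling client,
    i.e. the object returned by boto3.client("autoscaling").
    """
    return autoscaling_client.set_instance_protection(
        AutoScalingGroupName=group_name,
        InstanceIds=[instance_id],
        ProtectedFromScaleIn=True,
    )


# Typical wiring (needs boto3 and AWS credentials; all values are placeholders):
#   import boto3
#   client = boto3.client("autoscaling", region_name="eu-west-1")
#   protect_from_scale_in(client, "my-eb-auto-scaling-group", "i-0123456789abcdef0")
```

Passing the client in as an argument keeps the function easy to try out with a stub before pointing it at a real environment.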

Add a bash script for the celery worker and beat configuration.

Add file root_folder/.ebextensions/files/celery_configuration.txt :

#!/usr/bin/env bash

# Get django environment variables
celeryenv=`cat /opt/python/current/env | tr '\n' ',' | sed 's/export //g' | sed 's/$PATH/%(ENV_PATH)s/g' | sed 's/$PYTHONPATH//g' | sed 's/$LD_LIBRARY_PATH//g' | sed 's/%/%%/g'`
celeryenv=${celeryenv%?}

# Create celery configuration script
celeryconf="[program:celeryd-worker]
; Set full path to celery program if using virtualenv
command=/opt/python/run/venv/bin/celery worker -A django_app --loglevel=INFO

directory=/opt/python/current/app
user=nobody
numprocs=1
stdout_logfile=/var/log/celery-worker.log
stderr_logfile=/var/log/celery-worker.log
autostart=true
autorestart=true
startsecs=10

; Need to wait for currently executing tasks to finish at shutdown.
; Increase this if you have very long running tasks.
stopwaitsecs = 600

; When resorting to send SIGKILL to the program to terminate it
; send SIGKILL to its whole process group instead,
; taking care of its children as well.
killasgroup=true

; if rabbitmq is supervised, set its priority higher
; so it starts first
priority=998

environment=$celeryenv

[program:celeryd-beat]
; Set full path to celery program if using virtualenv
command=/opt/python/run/venv/bin/celery beat -A django_app --loglevel=INFO --workdir=/tmp -S django --pidfile /tmp/celerybeat.pid

directory=/opt/python/current/app
user=nobody
numprocs=1
stdout_logfile=/var/log/celery-beat.log
stderr_logfile=/var/log/celery-beat.log
autostart=true
autorestart=true
startsecs=10

; Need to wait for currently executing tasks to finish at shutdown.
; Increase this if you have very long running tasks.
stopwaitsecs = 600

; When resorting to send SIGKILL to the program to terminate it
; send SIGKILL to its whole process group instead,
; taking care of its children as well.
killasgroup=true

; if rabbitmq is supervised, set its priority higher
; so it starts first
priority=998

environment=$celeryenv"

# Create the celery supervisord conf script
echo "$celeryconf" | tee /opt/python/etc/celery.conf

# Add configuration script to supervisord conf (if not there already)
if ! grep -Fxq "[include]" /opt/python/etc/supervisord.conf
  then
  echo "[include]" | tee -a /opt/python/etc/supervisord.conf
  echo "files: celery.conf" | tee -a /opt/python/etc/supervisord.conf
fi

# Reread the supervisord config
supervisorctl -c /opt/python/etc/supervisord.conf reread

# Update supervisord in cache without restarting all services
supervisorctl -c /opt/python/etc/supervisord.conf update

# Start/Restart celeryd through supervisord
supervisorctl -c /opt/python/etc/supervisord.conf restart celeryd-beat
supervisorctl -c /opt/python/etc/supervisord.conf restart celeryd-worker
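The `celeryenv` pipeline at the top of the script is dense, so here is a rough Python equivalent of what it does to the environment file, step by step. This is only a reading aid, not part of the deployment; the function name is made up, and it assumes the file consists of `export KEY="value"` lines, as the `sed 's/export //g'` step suggests.

```python
def supervisor_environment(env_text):
    """Mirror the tr/sed pipeline that builds the supervisord `environment=` value."""
    s = env_text.replace("\n", ",")           # tr '\n' ','
    s = s.replace("export ", "")              # sed 's/export //g'
    s = s.replace("$PATH", "%(ENV_PATH)s")    # sed 's/$PATH/%(ENV_PATH)s/g'
    s = s.replace("$PYTHONPATH", "")          # sed 's/$PYTHONPATH//g'
    s = s.replace("$LD_LIBRARY_PATH", "")     # sed 's/$LD_LIBRARY_PATH//g'
    s = s.replace("%", "%%")                  # sed 's/%/%%/g'
    return s[:-1]                             # ${celeryenv%?} drops the trailing comma


# Hypothetical two-line environment file, for illustration only.
sample = (
    'export DJANGO_SETTINGS_MODULE="django_app.settings"\n'
    'export PATH="/opt/python/run/venv/bin:$PATH"\n'
)
print(supervisor_environment(sample))
```

As written, the final `%` doubling also applies to the `%(ENV_PATH)s` placeholder inserted two steps earlier. The answer does not say whether that is intended, so check the generated /opt/python/etc/celery.conf if PATH looks wrong inside the worker.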

Make sure the script is executed during deployment, but only on the main node (leader_only: true). Add file root_folder/.ebextensions/02-python.config :

container_commands:
  04_celery_tasks:
    command: "cat .ebextensions/files/celery_configuration.txt > /opt/elasticbeanstalk/hooks/appdeploy/post/run_supervised_celeryd.sh && chmod 744 /opt/elasticbeanstalk/hooks/appdeploy/post/run_supervised_celeryd.sh"
    leader_only: true
  05_celery_tasks_run:
    command: "/opt/elasticbeanstalk/hooks/appdeploy/post/run_supervised_celeryd.sh"
    leader_only: true

Beat can be configured without redeployment by using a separate Django application: https://pypi.python.org/pypi/django_celery_beat .
Storing task results is a good idea too: https://pypi.python.org/pypi/django_celery_results

File requirements.txt

celery==4.0.0
django_celery_beat==1.0.1
django_celery_results==1.0.1
pycurl==7.43.0 --global-option="--with-nss"

Configure celery for the Amazon SQS broker (get your desired endpoint from the list: http://docs.aws.amazon.com/general/latest/gr/rande.html ) in root_folder/django_app/settings.py :

...
CELERY_RESULT_BACKEND = 'django-db'
CELERY_BROKER_URL = 'sqs://%s:%s@' % (aws_access_key_id, aws_secret_access_key)
# SQS region; Ireland ("eu-west-1") is used here, change it to the region of your queues.
CELERY_BROKER_TRANSPORT_OPTIONS = {
    "region": "eu-west-1",
    'queue_name_prefix': 'django_app-%s-' % os.environ.get('APP_ENV', 'dev'),
    'visibility_timeout': 360,
    'polling_interval': 1
}
...
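One detail worth guarding against in the broker URL: AWS secret keys can contain characters such as `/` and `+`, which need to be percent-encoded before they are placed in a URL. A minimal sketch, assuming Python 3 (for `urllib.parse`); the helper name and the sample credentials are invented for illustration:

```python
from urllib.parse import quote


def sqs_broker_url(access_key_id, secret_access_key):
    """Build an 'sqs://KEY:SECRET@' URL with both parts percent-encoded."""
    return "sqs://{}:{}@".format(
        quote(access_key_id, safe=""),       # safe="" also encodes '/'
        quote(secret_access_key, safe=""),
    )


# Made-up credentials; in settings.py you would pass your own values.
print(sqs_broker_url("AKIAEXAMPLE", "abc/def+ghi"))
```

In settings.py the result would simply be assigned to CELERY_BROKER_URL in place of the `%` formatting shown above.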

Celery configuration for the django_app Django app

Add file root_folder/django_app/celery.py :

from __future__ import absolute_import, unicode_literals
import os
from celery import Celery

# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'django_app.settings')

app = Celery('django_app')

# Using a string here means the worker doesn't have to serialize
# the configuration object to child processes.
# - namespace='CELERY' means all celery-related configuration keys
#   should have a `CELERY_` prefix.
app.config_from_object('django.conf:settings', namespace='CELERY')

# Load task modules from all registered Django app configs.
app.autodiscover_tasks()

Modify file root_folder/django_app/__init__.py :

from __future__ import absolute_import, unicode_literals

# This will make sure the app is always imported when
# Django starts so that shared_task will use this app.
from django_app.celery import app as celery_app

__all__ = ['celery_app']

See also:

How do you run a worker with AWS Elastic Beanstalk? (solution without scalability)
Pip Requirements.txt --global-option causing installation errors with other packages. "option not recognized" (solution for problems caused by the obsolete pip on Elastic Beanstalk, which cannot handle the global options needed to resolve the pycurl dependency properly)

Answer 2

This is how I extended the answer by @smentek to allow for multiple worker instances and a single beat instance. The same caveat applies: you have to protect your leader. (I don't have an automated solution for that yet.)

Please note that environment variable updates made to EB via the EB CLI or the web interface are not reflected by celery beat or the workers until the app server has been restarted. This caught me off guard once.

A single celery_configuration.sh file outputs two config files for supervisord. Note that the celeryd-beat program has autostart=false ; otherwise you end up with many beats after an instance restart:

# get django environment variables
celeryenv=`cat /opt/python/current/env | tr '\n' ',' | sed 's/export //g' | sed 's/$PATH/%(ENV_PATH)s/g' | sed 's/$PYTHONPATH//g' | sed 's/$LD_LIBRARY_PATH//g' | sed 's/%/%%/g'`
celeryenv=${celeryenv%?}

# create celery beat config script
celerybeatconf="[program:celeryd-beat]
; Set full path to celery program if using virtualenv
command=/opt/python/run/venv/bin/celery beat -A lexvoco --loglevel=INFO --workdir=/tmp -S django --pidfile /tmp/celerybeat.pid

directory=/opt/python/current/app
user=nobody
numprocs=1
stdout_logfile=/var/log/celery-beat.log
stderr_logfile=/var/log/celery-beat.log
autostart=false
autorestart=true
startsecs=10

; Need to wait for currently executing tasks to finish at shutdown.
; Increase this if you have very long running tasks.
stopwaitsecs = 10

; When resorting to send SIGKILL to the program to terminate it
; send SIGKILL to its whole process group instead,
; taking care of its children as well.
killasgroup=true

; if rabbitmq is supervised, set its priority higher
; so it starts first
priority=998

environment=$celeryenv"

# create celery worker config script
celeryworkerconf="[program:celeryd-worker]
; Set full path to celery program if using virtualenv
command=/opt/python/run/venv/bin/celery worker -A lexvoco --loglevel=INFO

directory=/opt/python/current/app
user=nobody
numprocs=1
stdout_logfile=/var/log/celery-worker.log
stderr_logfile=/var/log/celery-worker.log
autostart=true
autorestart=true
startsecs=10

; Need to wait for currently executing tasks to finish at shutdown.
; Increase this if you have very long running tasks.
stopwaitsecs = 600

; When resorting to send SIGKILL to the program to terminate it
; send SIGKILL to its whole process group instead,
; taking care of its children as well.
killasgroup=true

; if rabbitmq is supervised, set its priority higher
; so it starts first
priority=999

environment=$celeryenv"

# create files for the scripts
echo "$celerybeatconf" | tee /opt/python/etc/celerybeat.conf
echo "$celeryworkerconf" | tee /opt/python/etc/celeryworker.conf

# add configuration script to supervisord conf (if not there already)
if ! grep -Fxq "[include]" /opt/python/etc/supervisord.conf
  then
  echo "[include]" | tee -a /opt/python/etc/supervisord.conf
  echo "files: celerybeat.conf celeryworker.conf" | tee -a /opt/python/etc/supervisord.conf
fi

# reread the supervisord config
/usr/local/bin/supervisorctl -c /opt/python/etc/supervisord.conf reread
# update supervisord in cache without restarting all services
/usr/local/bin/supervisorctl -c /opt/python/etc/supervisord.conf update
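If you want to double-check which programs will start on their own, the generated files can be read with Python's standard `configparser`, since supervisord configs use INI syntax. The sketch below is only a sanity check under that assumption; the sample text is a trimmed, hypothetical excerpt of the two generated files, and the function name is made up.

```python
import configparser

# Hypothetical excerpt of celerybeat.conf + celeryworker.conf, trimmed to a few keys.
SAMPLE = """
[program:celeryd-beat]
; beat must not start by itself on every instance
autostart=false
autorestart=true

[program:celeryd-worker]
autostart=true
autorestart=true
"""


def autostart_flags(conf_text):
    """Return {section name: autostart as bool} for every section in the text."""
    # interpolation=None so that '%' characters in values are left untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    return {name: parser.getboolean(name, "autostart") for name in parser.sections()}


print(autostart_flags(SAMPLE))
```

With this setup, only the worker should report True; beat is then started explicitly on the leader.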

Then in container_commands we only restart beat on the leader:

container_commands:
  # create the celery configuration file
  01_create_celery_beat_configuration_file:
    command: "cat .ebextensions/files/celery_configuration.sh > /opt/elasticbeanstalk/hooks/appdeploy/post/run_supervised_celeryd.sh && chmod 744 /opt/elasticbeanstalk/hooks/appdeploy/post/run_supervised_celeryd.sh && sed -i 's/\r$//' /opt/elasticbeanstalk/hooks/appdeploy/post/run_supervised_celeryd.sh"
  # restart celery beat if leader
  02_start_celery_beat:
    command: "/usr/local/bin/supervisorctl -c /opt/python/etc/supervisord.conf restart celeryd-beat"
    leader_only: true
  # restart celery worker
  03_start_celery_worker:
    command: "/usr/local/bin/supervisorctl -c /opt/python/etc/supervisord.conf restart celeryd-worker"

Answer 3

If someone is following smentek's answer and getting the error:

05_celery_tasks_run: /usr/bin/env bash does not exist.

know that, if you are using Windows, your problem might be that the "celery_configuration.txt" file has Windows line endings (CRLF) when it should have Unix line endings (LF). If using Notepad++, open the file and click on "Edit > EOL Conversion > Unix (LF)". Save and redeploy, and the error is gone.
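If you would rather not depend on an editor setting, the same conversion can be scripted before deploying. A minimal sketch, with a made-up function name; it rewrites the file in place, so try it on a copy first:

```python
def to_unix_eol(path):
    """Replace CRLF with LF in the file at `path`; return True if it was changed."""
    with open(path, "rb") as fh:
        data = fh.read()
    converted = data.replace(b"\r\n", b"\n")
    if converted != data:
        with open(path, "wb") as fh:
            fh.write(converted)
    return converted != data


# Example call (path taken from the answer above):
#   to_unix_eol(".ebextensions/files/celery_configuration.txt")
```

Working on bytes avoids any newline translation by Python itself, which matters when the script is run on Windows.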

Also, a couple of warnings for really amateur people like me:

Be sure to include "django_celery_beat" and "django_celery_results" in your "INSTALLED_APPS" in the settings.py file.
To check celery errors, connect to your instance with "eb ssh" and then run "tail -n 40 /var/log/celery-worker.log" and "tail -n 40 /var/log/celery-beat.log" (where "40" is the number of lines you want to read from the end of the file).

Hope this helps someone, it would've saved me some hours!
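If you prefer to pull the last lines of a log from Python (for example inside a small diagnostics script on the instance), a bounded deque does roughly what `tail -n` does. This is just a sketch; the function name is hypothetical, and reading files under /var/log may require suitable permissions.

```python
from collections import deque


def tail(path, n=40):
    """Return the last `n` lines of a text file, without trailing newlines."""
    with open(path, "r", errors="replace") as fh:
        # A deque with maxlen keeps only the most recent n lines while iterating.
        return [line.rstrip("\n") for line in deque(fh, maxlen=n)]


# Example call (log path taken from the answer above):
#   for line in tail("/var/log/celery-worker.log", 40):
#       print(line)
```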

Answer 1: score 25, accepted, 2016-12-15 10:19:36
Answer 2: score 5, 2018-10-30 01:34:06
Answer 3: score 1, 2019-01-25 09:10:17
