
How to run a celery worker with a Django app scalable by AWS Elastic Beanstalk?

How do I use Django with AWS Elastic Beanstalk so that it also runs tasks via celery, but only on the main node?

This is how I set up celery with django on elastic beanstalk with scalability working fine.

Please keep in mind that the 'leader_only' option for container_commands works only on environment rebuild or deployment of the app. If the service runs long enough, the leader node may be removed by Elastic Beanstalk. To deal with that, you may have to apply instance protection to your leader node. Check: http://docs.aws.amazon.com/autoscaling/latest/userguide/as-instance-termination.html#instance-protection-instance
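As an illustration, scale-in protection can be applied from the AWS CLI; the instance ID and Auto Scaling group name below are placeholders you would substitute with your own:

# protect the leader instance from scale-in (placeholder IDs/names)
aws autoscaling set-instance-protection \
    --instance-ids i-0123456789abcdef0 \
    --auto-scaling-group-name my-eb-asg \
    --protected-from-scale-in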

Add a bash script for the celery worker and beat configuration.

Add file root_folder/.ebextensions/files/celery_configuration.txt :

#!/usr/bin/env bash

# Get django environment variables
celeryenv=`cat /opt/python/current/env | tr '\n' ',' | sed 's/export //g' | sed 's/$PATH/%(ENV_PATH)s/g' | sed 's/$PYTHONPATH//g' | sed 's/$LD_LIBRARY_PATH//g' | sed 's/%/%%/g'`
celeryenv=${celeryenv%?}

# Create celery configuration script
celeryconf="[program:celeryd-worker]
; Set full path to celery program if using virtualenv
command=/opt/python/run/venv/bin/celery worker -A django_app --loglevel=INFO

directory=/opt/python/current/app
user=nobody
numprocs=1
stdout_logfile=/var/log/celery-worker.log
stderr_logfile=/var/log/celery-worker.log
autostart=true
autorestart=true
startsecs=10

; Need to wait for currently executing tasks to finish at shutdown.
; Increase this if you have very long running tasks.
stopwaitsecs = 600

; When resorting to send SIGKILL to the program to terminate it
; send SIGKILL to its whole process group instead,
; taking care of its children as well.
killasgroup=true

; if rabbitmq is supervised, set its priority higher
; so it starts first
priority=998

environment=$celeryenv

[program:celeryd-beat]
; Set full path to celery program if using virtualenv
command=/opt/python/run/venv/bin/celery beat -A django_app --loglevel=INFO --workdir=/tmp -S django --pidfile /tmp/celerybeat.pid

directory=/opt/python/current/app
user=nobody
numprocs=1
stdout_logfile=/var/log/celery-beat.log
stderr_logfile=/var/log/celery-beat.log
autostart=true
autorestart=true
startsecs=10

; Need to wait for currently executing tasks to finish at shutdown.
; Increase this if you have very long running tasks.
stopwaitsecs = 600

; When resorting to send SIGKILL to the program to terminate it
; send SIGKILL to its whole process group instead,
; taking care of its children as well.
killasgroup=true

; if rabbitmq is supervised, set its priority higher
; so it starts first
priority=998

environment=$celeryenv"

# Create the celery supervisord conf script
echo "$celeryconf" | tee /opt/python/etc/celery.conf

# Add configuration script to supervisord conf (if not there already)
if ! grep -Fxq "[include]" /opt/python/etc/supervisord.conf
  then
  echo "[include]" | tee -a /opt/python/etc/supervisord.conf
  echo "files: celery.conf" | tee -a /opt/python/etc/supervisord.conf
fi

# Reread the supervisord config
supervisorctl -c /opt/python/etc/supervisord.conf reread

# Update supervisord in cache without restarting all services
supervisorctl -c /opt/python/etc/supervisord.conf update

# Start/Restart celeryd through supervisord
supervisorctl -c /opt/python/etc/supervisord.conf restart celeryd-beat
supervisorctl -c /opt/python/etc/supervisord.conf restart celeryd-worker

Take care of script execution during deployment, but only on the main node (leader_only: true). Add file root_folder/.ebextensions/02-python.config :

container_commands:
  04_celery_tasks:
    command: "cat .ebextensions/files/celery_configuration.txt > /opt/elasticbeanstalk/hooks/appdeploy/post/run_supervised_celeryd.sh && chmod 744 /opt/elasticbeanstalk/hooks/appdeploy/post/run_supervised_celeryd.sh"
    leader_only: true
  05_celery_tasks_run:
    command: "/opt/elasticbeanstalk/hooks/appdeploy/post/run_supervised_celeryd.sh"
    leader_only: true

File requirements.txt

celery==4.0.0
django_celery_beat==1.0.1
django_celery_results==1.0.1
pycurl==7.43.0 --global-option="--with-nss"

Configure celery for the Amazon SQS broker (get your desired endpoint from the list: http://docs.aws.amazon.com/general/latest/gr/rande.html ) in root_folder/django_app/settings.py :

...
CELERY_RESULT_BACKEND = 'django-db'
CELERY_BROKER_URL = 'sqs://%s:%s@' % (aws_access_key_id, aws_secret_access_key)
# Due to an error in the lib, the N. Virginia region is used temporarily. Please set it to Ireland ("eu-west-1") after the fix.
CELERY_BROKER_TRANSPORT_OPTIONS = {
    "region": "eu-west-1",
    'queue_name_prefix': 'django_app-%s-' % os.environ.get('APP_ENV', 'dev'),
    'visibility_timeout': 360,
    'polling_interval': 1
}
...
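The excerpt above assumes aws_access_key_id and aws_secret_access_key are defined earlier in settings.py; a minimal sketch, assuming the credentials are exposed as environment variables on the instance (the variable names here are an assumption, not part of the original answer):

# hypothetical: read AWS credentials from environment variables set on the EB environment
aws_access_key_id = os.environ.get('AWS_ACCESS_KEY_ID', '')
aws_secret_access_key = os.environ.get('AWS_SECRET_ACCESS_KEY', '')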

Celery configuration for the django_app Django app

Add file root_folder/django_app/celery.py :

from __future__ import absolute_import, unicode_literals
import os
from celery import Celery

# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'django_app.settings')

app = Celery('django_app')

# Using a string here means the worker doesn't have to serialize
# the configuration object to child processes.
# - namespace='CELERY' means all celery-related configuration keys
#   should have a `CELERY_` prefix.
app.config_from_object('django.conf:settings', namespace='CELERY')

# Load task modules from all registered Django app configs.
app.autodiscover_tasks()

Modify file root_folder/django_app/__init__.py :

from __future__ import absolute_import, unicode_literals

# This will make sure the app is always imported when
# Django starts so that shared_task will use this app.
from django_app.celery import app as celery_app

__all__ = ['celery_app']
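For completeness, the tasks picked up by autodiscover_tasks() live in a tasks.py module inside any installed app. A minimal illustrative sketch (the app path and task body are hypothetical, not part of the original answer):

# root_folder/some_app/tasks.py (illustrative)
from __future__ import absolute_import, unicode_literals
from celery import shared_task

@shared_task
def add(x, y):
    # trivial example task; replace with real work
    return x + y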

Check also:

This is how I extended the answer by @smentek to allow for multiple worker instances and a single beat instance - the same thing applies where you have to protect your leader. (I still don't have an automated solution for that yet.)

Please note that envvar updates to EB via the EB CLI or the web interface are not reflected by celery beat or workers until an app server restart has taken place. This caught me off guard once.
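One way to pick up new environment variables without waiting for a full redeploy (a sketch, assuming the hook script installed by the container_commands in these answers) is to SSH into the instance, re-run the hook so the supervisord config is regenerated from the current env, and then restart the celery programs:

eb ssh
# on the instance: regenerate celery supervisord config, then restart the programs
sudo /opt/elasticbeanstalk/hooks/appdeploy/post/run_supervised_celeryd.sh
sudo /usr/local/bin/supervisorctl -c /opt/python/etc/supervisord.conf restart celeryd-worker celeryd-beat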

A single celery_configuration.sh file outputs two scripts for supervisord; note that celery-beat has autostart=false, otherwise you end up with many beats after an instance restart:

# get django environment variables
celeryenv=`cat /opt/python/current/env | tr '\n' ',' | sed 's/export //g' | sed 's/$PATH/%(ENV_PATH)s/g' | sed 's/$PYTHONPATH//g' | sed 's/$LD_LIBRARY_PATH//g' | sed 's/%/%%/g'`
celeryenv=${celeryenv%?}

# create celery beat config script
celerybeatconf="[program:celeryd-beat]
; Set full path to celery program if using virtualenv
command=/opt/python/run/venv/bin/celery beat -A lexvoco --loglevel=INFO --workdir=/tmp -S django --pidfile /tmp/celerybeat.pid

directory=/opt/python/current/app
user=nobody
numprocs=1
stdout_logfile=/var/log/celery-beat.log
stderr_logfile=/var/log/celery-beat.log
autostart=false
autorestart=true
startsecs=10

; Need to wait for currently executing tasks to finish at shutdown.
; Increase this if you have very long running tasks.
stopwaitsecs = 10

; When resorting to send SIGKILL to the program to terminate it
; send SIGKILL to its whole process group instead,
; taking care of its children as well.
killasgroup=true

; if rabbitmq is supervised, set its priority higher
; so it starts first
priority=998

environment=$celeryenv"

# create celery worker config script
celeryworkerconf="[program:celeryd-worker]
; Set full path to celery program if using virtualenv
command=/opt/python/run/venv/bin/celery worker -A lexvoco --loglevel=INFO

directory=/opt/python/current/app
user=nobody
numprocs=1
stdout_logfile=/var/log/celery-worker.log
stderr_logfile=/var/log/celery-worker.log
autostart=true
autorestart=true
startsecs=10

; Need to wait for currently executing tasks to finish at shutdown.
; Increase this if you have very long running tasks.
stopwaitsecs = 600

; When resorting to send SIGKILL to the program to terminate it
; send SIGKILL to its whole process group instead,
; taking care of its children as well.
killasgroup=true

; if rabbitmq is supervised, set its priority higher
; so it starts first
priority=999

environment=$celeryenv"

# create files for the scripts
echo "$celerybeatconf" | tee /opt/python/etc/celerybeat.conf
echo "$celeryworkerconf" | tee /opt/python/etc/celeryworker.conf

# add configuration script to supervisord conf (if not there already)
if ! grep -Fxq "[include]" /opt/python/etc/supervisord.conf
  then
  echo "[include]" | tee -a /opt/python/etc/supervisord.conf
  echo "files: celerybeat.conf celeryworker.conf" | tee -a /opt/python/etc/supervisord.conf
fi

# reread the supervisord config
/usr/local/bin/supervisorctl -c /opt/python/etc/supervisord.conf reread
# update supervisord in cache without restarting all services
/usr/local/bin/supervisorctl -c /opt/python/etc/supervisord.conf update

Then in container_commands we only restart beat on the leader:

container_commands:
  # create the celery configuration file
  01_create_celery_beat_configuration_file:
    command: "cat .ebextensions/files/celery_configuration.sh > /opt/elasticbeanstalk/hooks/appdeploy/post/run_supervised_celeryd.sh && chmod 744 /opt/elasticbeanstalk/hooks/appdeploy/post/run_supervised_celeryd.sh && sed -i 's/\r$//' /opt/elasticbeanstalk/hooks/appdeploy/post/run_supervised_celeryd.sh"
  # restart celery beat if leader
  02_start_celery_beat:
    command: "/usr/local/bin/supervisorctl -c /opt/python/etc/supervisord.conf restart celeryd-beat"
    leader_only: true
  # restart celery worker
  03_start_celery_worker:
    command: "/usr/local/bin/supervisorctl -c /opt/python/etc/supervisord.conf restart celeryd-worker"

If someone is following smentek's answer and getting the error:

05_celery_tasks_run: /usr/bin/env bash does not exist.

Know that, if you are using Windows, your problem might be that the "celery_configuration.txt" file has Windows EOL when it should have Unix EOL. If using Notepad++, open the file and click on "Edit > EOL Conversion > Unix (LF)". Save, redeploy, and the error is no longer there.
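Alternatively, assuming a Unix-like shell with sed available, the carriage returns can be stripped from the command line, mirroring the sed call already used in the container_commands above:

# convert Windows line endings to Unix in the configuration script
sed -i 's/\r$//' .ebextensions/files/celery_configuration.txt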

Also, a couple of warnings for really-amateur people like me: 还有像我这样的真正业余爱好者的几个警告:

  • Be sure to include "django_celery_beat" and "django_celery_results" in your "INSTALLED_APPS" in the settings.py file (see the snippet after this list).

  • To check celery errors, connect to your instance with "eb ssh" and then "tail -n 40 /var/log/celery-worker.log" and "tail -n 40 /var/log/celery-beat.log" (where "40" refers to the number of lines you want to read from the file, starting from the end).
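A minimal sketch of the relevant settings.py entries; the placeholder comment stands in for whatever apps your project already lists:

INSTALLED_APPS = [
    # ... your existing Django and project apps ...
    'django_celery_beat',     # stores the beat schedule in the database (-S django)
    'django_celery_results',  # stores task results (CELERY_RESULT_BACKEND = 'django-db')
]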

Hope this helps someone, it would've saved me some hours!
