
Periodic and non-periodic tasks with Django + Telegram + Celery

I am building a project based on Django, and one of my goals is to have a Telegram bot that receives information from a Telegram group. I was able to implement the bot to send messages to Telegram without issues.

At the moment I have a couple of Celery tasks running with Beat, as well as the Django web app; the two are decoupled. All good here.

I have seen that one of the python-telegram-bot examples ( https://github.com/python-telegram-bot/python-telegram-bot/blob/master/examples/echobot.py ) runs a function that waits idle to receive data from Telegram. At the moment, all my Celery tasks are periodic and are called every 10 or 60 minutes by Beat. How can I run this non-periodic task with Celery in my configuration? I say non-periodic because, as I understand it, it will wait for content until it is manually interrupted.

  • Django~=3.2.6

  • celery~=5.1.2

CELERY_BEAT_SCHEDULE = {
    'task_1': {
        'task': 'apps.envc.tasks.Fetch1',
        'schedule': 600.0,
    },
    'task_2': {
        'task': 'apps.envc.tasks.Fetch2',
        'schedule': crontab(minute='*/60'),
    },
    'task_3': {
        'task': 'apps.envc.tasks.Analyze',
        'schedule': 600,
    },
}

In my tasks.py I have one of the tasks like this:

@celery_app.task(name='apps.envc.tasks.TelegramBot')
def TelegramBot():
    status = start_bot()
    return status

As for the start_bot implementation, I simply copied the echobot.py example and added my TOKEN there (of course the functions for the different commands from the example are also there).

In PTB, Updater.start_polling() and Updater.start_webhook() each start a background thread that waits for incoming updates. Updater.idle() blocks the main thread and, on receiving a stop signal, ends that background thread.

I'm not familiar with Celery and only know the basics of Django, but I see a few options here that I'd like to point out.

  • You can run the PTB-related code in a standalone thread, i.e. a thread that calls Updater.start_polling and Updater.idle. To end that thread on shutdown, you'll have to forward the stop signal to that thread.
  • Vice versa, you can run PTB in the main thread and the Django- and Celery-related tasks in a standalone thread.
  • You don't have to use Updater. Since you're using Django anyway, you could switch to a webhook-based solution for receiving updates, where Django serves as the webhook for you. You can even eliminate threading for PTB completely by calling Dispatcher.process_update manually. Please see this wiki page for more info on custom webhook solutions.
  • Finally, I'd like to mention that PTB comes with a built-in solution for scheduling tasks; see the wiki page on Job Queue. This may or may not be relevant for you depending on your setup.
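
The first option can be sketched without any framework. The snippet below is only an illustration using the standard library: PollingThread and poll_once are hypothetical names standing in for whatever blocking work the bot does (in PTB, Updater.start_polling and Updater.idle take care of this internally), and a threading.Event plays the role of the forwarded stop signal.

```python
import threading
import time


class PollingThread(threading.Thread):
    """Runs a polling loop until asked to stop (hypothetical helper, not part of PTB)."""

    def __init__(self, poll_once, interval=0.01):
        super().__init__(daemon=True)
        self._poll_once = poll_once          # callable doing one round of polling
        self._interval = interval            # pause between rounds, in seconds
        self._stop_event = threading.Event()

    def run(self):
        # Keep polling until stop() is called from another thread.
        while not self._stop_event.is_set():
            self._poll_once()
            # wait() returns early as soon as the stop event is set.
            self._stop_event.wait(self._interval)

    def stop(self, timeout=1.0):
        # Forward the stop request to the loop, then wait for it to finish.
        self._stop_event.set()
        self.join(timeout)


received = []
poller = PollingThread(lambda: received.append("update"))
poller.start()
time.sleep(0.05)   # the main thread is free to do other work here
poller.stop()
```

In a real deployment, stop() would be called from whatever shutdown hook or signal handler the main process already has.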

Disclaimer: I'm currently the maintainer of python-telegram-bot

Set up a webhook instead of polling with Celery

With Django, you shouldn't be using Celery to run Telegram polling (what you call PTB's “non-periodic task”, which is better described as a long-running process or service). Celery is designed for definite tasks, not indefinitely-running processes.

Since using Django implies that you're already running a web server, the webhook option is a better fit. (Remember that you can either do polling or set up a webhook in order to receive updates from Telegram's servers.) The option that @CallMeStag suggested, a non-threading webhook setup, makes the most sense for Django-PTB integration.

You can do the bot setup (defining and registering your handler functions on a Dispatcher instance) in a separate module; to avoid threading, you should pass update_queue=None, workers=0 to your Dispatcher instantiation. And then, use it in a Django view, like this:

import json
from django.http import JsonResponse
from django.views.decorators.csrf import csrf_exempt
from telegram import Update

from .telegram_init import telegram_bot, telegram_dispatcher

...

@csrf_exempt
def telegram_webhook(request):
    data = json.loads(request.body)
    update = Update.de_json(data, telegram_bot)
    telegram_dispatcher.process_update(update)

    return JsonResponse({})

where telegram_bot is the Bot instance that I use for instantiating telegram_dispatcher. (I left out error handling in this snippet.)
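
The error handling left out above could look roughly like this. It is a framework-free sketch: parse_update_payload is a hypothetical helper, and the only assumption about Telegram's payload is that an update is a JSON object carrying an integer update_id field. In the view, a None result would translate into an HTTP 400 response before Update.de_json is ever called.

```python
import json


def parse_update_payload(body):
    """Return the decoded update dict, or None if the body is not usable."""
    try:
        data = json.loads(body)
    except (ValueError, TypeError):
        # Malformed JSON, undecodable bytes, or a body of an unexpected type.
        return None
    if not isinstance(data, dict):
        # Valid JSON, but not an object (for example a list or a number).
        return None
    update_id = data.get("update_id")
    # bool is a subclass of int, so it is excluded explicitly.
    if not isinstance(update_id, int) or isinstance(update_id, bool):
        return None
    return data
```

Rejecting bad input this early keeps malformed requests away from the dispatcher and the handlers.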

Why avoid threading? Threads in general are not forbidden in Django, but in the context of PTB, threading usually means running bot updaters or dispatchers in a long-running thread that shares an update/message queue. That complication is neither elegant nor a good match for, say, a typical Django deployment that uses multiple Gunicorn workers in separate processes. There is, however, a motivation for using multithreading (multiple processes, actually, using Celery) in Django-PTB integration; see below.

Development environment caveat

The above setup is what you'd want to use for a basic production system. But during dev, unless your dev machine is internet-facing with a fixed IP, you probably can't use a webhook, so you'd still want to do polling. One way to do this is by creating a custom Django management command:

<my_app>/management/commands/polltelegram.py:

from django.core.management.base import BaseCommand

from my_django_project.telegram_init import telegram_updater


class Command(BaseCommand):
    help = 'Run Telegram bot polling.'

    def handle(self, *args, **options):
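        # Use the name that is actually imported above.
        telegram_updater.start_polling()
        self.stdout.write(
            'Telegram bot polling started. '
            'Press CTRL-C to terminate.'
        )
        # Block until a stop signal arrives, then shut the updater down.
        telegram_updater.idle()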
        self.stdout.write('Polling stopped.')

And then, during dev, run python manage.py polltelegram to fetch and process Telegram updates. (Run this along with python manage.py runserver to be able to use the main Django app simultaneously; the polling runs in a separate process with this setup, not just a separate thread.)

When Celery makes sense

Celery does have a role to play if you're integrating PTB with Django, namely when reliability is a concern: for instance, when you want to be able to retry sending replies after transient network issues. Another potential issue is that the non-threading webhook setup detailed above can, in a high-traffic scenario, run into flood/rate limits. PTB's current solution for this, MessageQueue, uses threading, and while it can work, it can introduce other problems, for example interference with Django's autoreload function when running runserver during dev.

A more elegant and reliable solution is to use Celery to run the message sending function of PTB. This allows for retries and rate limiting for better reliability.

Briefly described, this integration can still use the non-threading webhook setup above, but you have to isolate the Bot.send_message() function into a Celery task, and then make sure that all handlers call this Celery task asynchronously instead of using the bot to run send_message() in the webhook process 'eagerly'.
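
To make the retry idea concrete without requiring a broker, here is a plain-Python sketch of the behaviour you would delegate to Celery. send_with_retries and flaky_send are hypothetical names; in a real project the sending function would be a Celery task wrapping Bot.send_message(), configured with Celery's own task options (such as autoretry_for, retry_backoff and rate_limit) rather than a hand-written loop.

```python
import time


def send_with_retries(send, chat_id, text, max_retries=3, base_delay=0.01):
    """Call send(chat_id, text), retrying transient failures with exponential backoff."""
    for attempt in range(max_retries + 1):
        try:
            return send(chat_id, text)
        except ConnectionError:
            if attempt == max_retries:
                raise  # out of retries: let the caller see the error
            # Double the delay after each failed attempt.
            time.sleep(base_delay * 2 ** attempt)


# Hypothetical sender that fails twice before succeeding.
attempts = []


def flaky_send(chat_id, text):
    attempts.append((chat_id, text))
    if len(attempts) < 3:
        raise ConnectionError("simulated transient network issue")
    return "sent"


result = send_with_retries(flaky_send, chat_id=42, text="hello")
```

The handlers then only enqueue the message and return, so the webhook view stays fast while retries happen in the worker.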
