
How to run a task with celery in django and save result in django database?

I have made a scraper that scrapes some links from a web page, and I want to run it every hour. The scraper lives inside a Django app, but Django alone cannot run it on a schedule because views depend on the request/response cycle. To solve this I decided to use a Python library named Celery, and following the documentation I wrote the celery.py and tasks.py files below.

My Django project structure is like this:

newsportal
 - newsportal
    - settings.py
    - celery.py
    - __init__.py
 - news
    - tasks.py
    - views.py
    - models.py

celery.py has the following code:

from __future__ import absolute_import

import os

from celery import Celery

# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'newsportal.settings')

from django.conf import settings  # noqa

app = Celery('newsportal')

# Using a string here means the worker will not have to
# pickle the object when using Windows.
app.config_from_object('django.conf:settings')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)


@app.task(bind=True)
def debug_task(self):
    print('Request: {0!r}'.format(self.request))

The __init__.py file has the following lines of code:

from __future__ import absolute_import

# This will make sure the app is always imported when
# Django starts so that shared_task will use this app.
from .celery import app as celery_app  # noqa

while tasks.py has the following lines of code:

from __future__ import absolute_import
from celery import shared_task
from crawler import crawler
from .models import News

@shared_task
def news():
    '''
    Scrape all links and save them to the database.
    '''
    allnews = []  # store dict objects returned by the crawler
    allnews.append(crawler())
    for news_dict in allnews:
        for title, url in news_dict.items():
            # Save every scraped news item in the database
            News.objects.create(title=title, url=url)
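The shape of that loop can be checked in isolation. This sketch assumes crawler() returns a dict mapping titles to URLs, and swaps News.objects.create for a plain list append so it runs without Django:

```python
def collect_items(news_dict):
    """Flatten a {title: url} dict from the crawler into rows to save."""
    rows = []
    for title, url in news_dict.items():
        # In the real task this line would be:
        # News.objects.create(title=title, url=url)
        rows.append({'title': title, 'url': url})
    return rows

# Hypothetical crawler output for illustration
sample = {'Example headline': 'https://example.com/a'}
print(collect_items(sample))
```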

What I want to do is run the news() function above every hour and save its result to the Django database. How can I achieve this?

According to the Celery docs, to save the results produced by the worker we need to install django-celery==3.1.17 (which I have already done) and run the migrations.

For the database backend, according to the Celery docs, we should put

app.conf.update(
    CELERY_RESULT_BACKEND='djcelery.backends.database:DatabaseBackend',
)



in the settings.py file. On putting this code in `settings.py` I got the error:

settings.py", line 141, in <module>
    app.conf.update(
NameError: name 'app' is not defined

I have already imported and put the following lines of code in settings.py, as below:

from __future__ import absolute_import
BROKER_URL = 'redis://localhost'

The main things I want to do are:

  1. Run the above crawler every hour and save its results in the database table News. How can I accomplish this using Celery, or am I missing something?

  2. Are there any alternative ways to accomplish this task?

I believe you would use app.conf.update(...) in your celery.py if you wanted to add that configuration there.

Your app.config_from_object('django.conf:settings') call in celery.py indicates that you're loading the configuration settings from your settings.py file though.

So you should just be able to put CELERY_RESULT_BACKEND='djcelery.backends.database:DatabaseBackend' at the end of your settings.py file instead.

This should prevent you from getting that error.
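Since app.config_from_object('django.conf:settings') reads Celery settings from settings.py, both the result backend and the hourly schedule the question asks for can live there. A sketch using the Celery 3.x setting names that match the question's django-celery setup (the task path 'news.tasks.news' is inferred from the project layout above):

```python
# settings.py -- picked up by app.config_from_object('django.conf:settings')
from datetime import timedelta

BROKER_URL = 'redis://localhost'
CELERY_RESULT_BACKEND = 'djcelery.backends.database:DatabaseBackend'

# Run the scraping task every hour via celery beat
CELERYBEAT_SCHEDULE = {
    'scrape-news-every-hour': {
        'task': 'news.tasks.news',
        'schedule': timedelta(hours=1),
    },
}
```

For the schedule to actually fire, a beat scheduler has to run alongside the worker, e.g. `celery -A newsportal worker -B`.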

I know this is a little late, however I can highly recommend the django-celery-results package.

Installation is straightforward and the package is recommended by Celery itself. Simply return some output from your task and it will be stored in the database and accessible under the Django admin.
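For a modern Celery (4.x+) setup, a minimal django-celery-results configuration looks roughly like this:

```python
# settings.py
INSTALLED_APPS = [
    # ... your other apps ...
    'django_celery_results',
]

# Store task results in the Django database
CELERY_RESULT_BACKEND = 'django-db'
```

After adding this, run `python manage.py migrate django_celery_results` to create the result tables.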
