
Running asynchronous Python code in a Django web application

Is it OK to run certain pieces of code asynchronously in a Django web app? If so, how?

For example:

I have a search algorithm that returns hundreds or thousands of results. I want to record in the database that these items were the result of the search, so I can see what users are searching for most. I don't want the client to have to wait for an extra hundred or thousand database inserts. Is there a way I can do this asynchronously? Is there any danger in doing so? Is there a better way to achieve this?

As far as Django is concerned, yes.

The bigger concern is your web server and whether it plays nicely with threading. For instance, gunicorn's sync workers are single-threaded, but there are other worker types, such as greenlet-based ones. I'm not sure how well those play with threads.

Combining threading and multiprocessing can be an issue if you're forking from threads:

Status of mixing multiprocessing and threading in Python

http://bugs.python.org/issue6721

That being said, I know of popular performance analytics utilities that use threads to report metrics, so it seems to be an accepted practice.

In sum, it seems safest to use the threading.Thread object from the standard library, as long as whatever you do in the thread doesn't fork (for example, through Python's multiprocessing library).

https://docs.python.org/2/library/threading.html
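
A minimal sketch of the threading approach, using only the standard library. The names `record_search_results`, `handle_search` and `search_log` are hypothetical, and the list stands in for the database table; in a real Django view the background function would create model rows instead.

```python
import threading

# Stand-in for the database table; a real app would write model rows instead.
search_log = []
search_log_lock = threading.Lock()


def record_search_results(query, result_ids):
    """Hypothetical logging step: note which items a search returned."""
    with search_log_lock:
        for item_id in result_ids:
            search_log.append((query, item_id))


def handle_search(query):
    """Hypothetical view body: run the search, log in the background, return."""
    result_ids = list(range(1000))  # placeholder for the real search algorithm
    worker = threading.Thread(
        target=record_search_results,
        args=(query, result_ids),
        daemon=True,  # do not keep the process alive just for logging
    )
    worker.start()
    # Respond without waiting for the inserts. The thread is returned only so
    # that callers (or tests) can join it if they need to.
    return result_ids, worker
```

One trade-off to keep in mind: a daemon thread is stopped abruptly when the process exits, so any logging still in progress at that point is lost.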

Offloading work from the request thread is common practice, since the end goal is to return a result to the client (browser) as quickly as possible.

As I am sure you are aware, an HTTP request blocks: until you return a response, the client cannot do anything (it is stuck in a waiting state).

The de facto way of offloading work is Celery, a task queuing system.

I highly recommend you read the introduction in the Celery documentation, but in summary here is what happens:

  1. You mark certain pieces of code as "tasks". These are usually functions that you want to run asynchronously.

  2. Celery manages workers - you can think of them as threads - that will run these tasks.

  3. To communicate with the workers, a message queue is required. RabbitMQ is the one often recommended.

Once you have all the components running (it takes but a few minutes), your workflow goes like this:

  1. In your view, when you want to offload some work, you call the function that does that work through its .delay() method. This queues the call so that a worker can execute it in the background.

  2. Your view then returns a response immediately.

  3. You can then check for the result of the task, and take appropriate actions based on what needs to be done. There are ways to track progress as well.
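
The shape of that workflow can be sketched with the standard library alone. This is a toy in-process model, not Celery and not how Celery is implemented: `Task`, `worker_loop`, `record_search` and `search_view` are hypothetical names, a `queue.Queue` plays the message queue, and a single thread plays the worker. With Celery the equivalent pieces are a function decorated as a task and a call to its `.delay()` method, with the queue and the workers running as separate processes.

```python
import queue
import threading

task_queue = queue.Queue()  # plays the role of the message queue


class Task:
    """Wraps a function so calls can be queued with .delay()."""

    def __init__(self, func):
        self.func = func

    def delay(self, *args, **kwargs):
        # Enqueue the call and return at once; a worker runs it later.
        task_queue.put((self.func, args, kwargs))


def worker_loop():
    """Plays the role of a worker: pull calls off the queue and run them."""
    while True:
        func, args, kwargs = task_queue.get()
        try:
            func(*args, **kwargs)
        except Exception:
            pass  # a real worker would log the failure and perhaps retry
        finally:
            task_queue.task_done()


recorded = []  # stand-in for the database table


@Task
def record_search(query, result_ids):
    # Hypothetical task body; a real one would insert database rows.
    recorded.extend((query, item_id) for item_id in result_ids)


threading.Thread(target=worker_loop, daemon=True).start()


def search_view(query):
    """Hypothetical view: offload the logging, then respond immediately."""
    result_ids = [1, 2, 3]  # placeholder for the real search
    record_search.delay(query, result_ids)
    return result_ids
```

The view only pays for putting one item on the queue; the inserts happen whenever the worker gets to them.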

It is also good practice to include caching - so that you are not executing expensive tasks unnecessarily. For example, you might choose to offload a request to do some analytics on search keywords that will be placed in a report.

Once the report is generated, I would cache the results (if applicable) so that the same report can be displayed if requested later - rather than be generated again.
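
As an illustration of the caching idea, here is a small sketch that keeps a generated report in memory for a fixed time. Everything in it is an assumption made for the example: the function names, the five-minute lifetime, and the use of a plain dictionary. In a Django project the cache framework would usually take the place of the dictionary.

```python
import time
from collections import Counter

_report_cache = {}   # maps cache key -> (expiry timestamp, report)
CACHE_SECONDS = 300  # assumed lifetime of a cached report


def build_keyword_report(searches):
    """Hypothetical expensive step: count how often each keyword was searched."""
    return dict(Counter(searches))


def get_keyword_report(key, searches, now=None):
    """Return the cached report for `key`, rebuilding it only once it expires."""
    now = time.time() if now is None else now
    entry = _report_cache.get(key)
    if entry is not None and entry[0] > now:
        return entry[1]  # still fresh: skip the expensive rebuild
    report = build_keyword_report(searches)
    _report_cache[key] = (now + CACHE_SECONDS, report)
    return report
```

The `now` parameter exists only to make the expiry behaviour easy to exercise; normal callers would leave it out.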


 