
How to use python multiprocessing module in django view

I have a simple function that goes over a list of URLs, using GET to retrieve some information and update the DB (PostgreSQL) accordingly. The function works perfectly. However, going over each URL one at a time takes too much time.

Using Python, I can do the following to run these tasks in parallel:

from multiprocessing import Pool

def updateDB(ip):
    # code goes here...
    pass

if __name__ == '__main__':
    pool = Pool(processes=4)              # process per core
    pool.map(updateDB, ip)

This works pretty well. However, I'm trying to find out how to do the same in a Django project. Currently I have a function (view) that goes over each URL to get the information and updates the DB.

The only thing I could find is Celery, but it seems to be overkill for the simple task I want to perform.

Is there anything simple that I can do, or do I have to use Celery?

Though using Celery may seem like overkill, it is a well-known way of running asynchronous tasks. Essentially, Django serves the WSGI request-response cycle, which knows nothing about multiprocessing or background tasks.

Here are alternative options:

"Currently I have a function (view) that goes over each URL to get the information and updates the DB."

It means response time does not matter to you: instead of doing the work in the background (asynchronously), you are OK with doing it in the foreground as long as the response time is cut by a factor of 4 (using 4 sub-processes/threads). If that is the case, you can simply put your sample code in your view. For example:

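from multiprocessing import Pool

from django.http import HttpResponse

def updateDB(ip):
    # code goes here...
    pass

def my_view(request):
    # `ip` is the list from the question, assumed to be defined elsewhere
    with Pool(processes=4) as pool:       # process per core
        pool.map(updateDB, ip)            # blocks until every item is processed
    return HttpResponse("SUCCESS")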
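
The work in the question is mostly waiting on HTTP responses, so a thread-backed pool may be enough. The sketch below is a variation, not the code from the answer: it uses multiprocessing.pool.ThreadPool, which offers the same map interface as Pool. The names update_one and update_all are hypothetical, and update_one is only a stand-in that neither issues a GET request nor touches the database.

```python
from multiprocessing.pool import ThreadPool


def update_one(url):
    # Hypothetical stand-in for updateDB: a real version would issue the
    # GET request and write the result to the database.
    return (url, len(url))


def update_all(urls, workers=4):
    # map() blocks until every URL has been handled and keeps input order.
    with ThreadPool(processes=workers) as pool:
        return pool.map(update_one, urls)


results = update_all(["http://a.example", "http://bb.example"])
```

A view could call update_all(urls) and then return its HttpResponse; the request still blocks until the whole list is done.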

But if you want to do it asynchronously in the background, then you should use Celery or follow one of @BasicWolf's suggestions.
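
For comparison, here is a minimal sketch of the background idea using only the standard library: the work is handed to a daemon thread and the caller returns immediately. This is an illustration under stated assumptions, not Celery and not something the answers prescribe; run_in_background and update_one are hypothetical names. The thread lives inside the web server worker process, so unfinished work is lost if that process restarts, which is one reason a task queue such as Celery is commonly suggested.

```python
import threading


def update_one(url, done):
    # Hypothetical stand-in for updateDB; records the URL instead of
    # doing a GET request and a database write.
    done.append(url)


def run_in_background(urls, done):
    # Start the loop in a separate thread and return right away.
    def worker():
        for url in urls:
            update_one(url, done)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


done = []
thread = run_in_background(["http://a.example", "http://b.example"], done)
thread.join()  # a view would skip this and return its response instead
```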

I would recommend using gevent, which runs tasks concurrently in lightweight greenlets, instead of multiprocessing. Multiprocessing can cause problems in production environments where spawning new processes is restricted.

Example code:

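from django.http import HttpResponse
from gevent.pool import Pool

def square(number):
    return number * number

def home(request):
    pool = Pool(50)                       # up to 50 greenlets at once
    numbers = [1, 3, 5]
    results = pool.map(square, numbers)   # [1, 9, 25]
    # Join the results into text so the response body is readable
    return HttpResponse(", ".join(str(r) for r in results))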
