
Using gevent with Pyramid

I'm building a website using Pyramid, and I want to fetch some data from other websites. Because there may be 50+ calls to urlopen, I wanted to use gevent to speed things up.

Here's what I've got so far using gevent:

import urllib2    
from gevent import monkey; monkey.patch_all()
from gevent import pool

gpool = pool.Pool()  # only the pool submodule is imported above; the name gevent itself is not bound

def load_page(url):
    response = urllib2.urlopen(url)
    html = response.read()
    response.close()
    return html

def load_pages(urls):
    return gpool.map(load_page, urls)

Running pserve development.ini --reload gives:

NotImplementedError: gevent is only usable from a single thread

I've read that I need to monkey patch before anything else, but I'm not sure where the right place is for that. Also, is this a pserve-specific issue? Will I need to re-solve this problem when I move to mod_wsgi? Or is there a way to handle this use case (just urlopen) without gevent? I've seen suggestions for requests, but I couldn't find an example of fetching multiple pages in the docs.

Update 1:

I also tried eventlet from this SO question (almost directly copied from this eventlet example):

import eventlet
from eventlet.green import urllib2

def fetch(url):
    return urllib2.urlopen(url).read()

def fetch_multiple(urls):
    pool = eventlet.GreenPool()
    return pool.imap(fetch, urls)

However, when I call fetch_multiple, I'm getting TypeError: request() got an unexpected keyword argument 'return_response'

Update 2:

The TypeError from the previous update was likely caused by earlier attempts to monkeypatch with gevent and not properly restarting pserve. Once I restarted everything, it worked properly. Lesson learned.

There are multiple ways to do what you want:

  • Create a dedicated gevent thread, and explicitly dispatch all of your URL-opening jobs to that thread, which will then do the gevented urlopen requests.
  • Use threads instead of greenlets. Running 50 threads isn't going to tax any modern OS.
  • Use a thread pool and a queue. There's usually not much advantage to doing 50 downloads at the same time instead of, say, 8 at a time (as your browser probably does).
  • Use a different async framework instead of gevent, one that doesn't work by magically greenletifying your code.
  • Use a library that has its own non-magic async support, like pycurl.
  • Instead of mixing and matching incompatible frameworks, build the server around gevent too, or find some other framework that works for both your web-serving and your web-client needs.
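
As a rough illustration of the thread-pool option, here is a minimal sketch that uses only the standard library. It assumes Python 3, where urllib.request.urlopen replaces urllib2.urlopen; the fetch parameter, the timeout value, and the default of 8 workers are illustrative choices, not anything from the question.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen  # Python 3 counterpart of urllib2.urlopen


def load_page(url, timeout=10):
    """Download one URL and return the raw response body as bytes."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def load_pages(urls, fetch=load_page, max_workers=8):
    """Fetch every URL with at most max_workers downloads in flight.

    fetch is a hypothetical hook so the download function can be swapped
    out (for example in tests). Results are returned in the same order
    as urls.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(fetch, urls))
```

Because this relies on ordinary OS threads, it needs no monkeypatching and should behave the same under pserve or mod_wsgi. If any download raises, the exception surfaces when the results are collected.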

You could simulate the last one without changing frameworks by loading gevent first and having it monkeypatch your threads, forcing your existing threaded server framework to become a gevent server. But this may not work, or mostly work but occasionally fail, or work but be much slower… Really, using a framework designed to be gevent-friendly (or at least greenlet-friendly) is a much better idea, if that's the way you want to go.

You mentioned that others had recommended requests. The reason you can't find the documentation is that the built-in async code in requests was removed. See an older version for how it was used. It's now available as a separate library, grequests. However, it works by implicitly wrapping requests with gevent, so it will have exactly the same issues as doing so yourself.

(There are other reasons to use requests instead of urllib2, and if you want to gevent it, it's easier to use grequests than to do it yourself.)

I've had similar problems with gevent when trying to deploy a web application. The least-hassle option is to use a WSGI deployment that runs on gevent; examples include Gunicorn, uWSGI, or one of gevent's built-in WSGI servers. Pyramid should have a way of using an alternate deployment. If large portions of your code rely on gevent, it's easier to just use a server that runs on gevent as well.

So, basically the last bullet in the above answer.
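
For what it's worth, a launcher along these lines is one way to run a Pyramid app on gevent's built-in WSGI server. This is only a sketch under assumptions: gevent and Pyramid are installed, the app lives in the main section of development.ini, and the function name, host, and port are made up for illustration. The imports sit inside the function so that the monkeypatch runs before Pyramid or anything else pulls in socket or threading.

```python
def main(config_uri="development.ini", host="127.0.0.1", port=6543):
    """Serve a Pyramid app with gevent's WSGI server (hypothetical launcher)."""
    # Patch first, before any other import touches socket or threading.
    from gevent import monkey
    monkey.patch_all()

    # Only now import the server and load the application.
    from gevent.pywsgi import WSGIServer
    from pyramid.paster import get_app

    app = get_app(config_uri, "main")  # assumes an [app:main] section
    WSGIServer((host, port), app).serve_forever()
```

Calling main() from a small script would replace pserve here, so pserve's --reload would not be available with this approach.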
