
TypeError: expected httplib.Message, got <type 'instance'> when using requests.get(url) on GAE

My goal is to build a web crawler and host it on GAE. However, when I try to run a very basic implementation, I get the following error:

Traceback (most recent call last):
  File "C:\Program Files (x86)\Google\google_appengine\lib\webapp2-2.5.2\webapp2.py", line 1535, in __call__
    rv = self.handle_exception(request, response, e)
  File "C:\Program Files (x86)\Google\google_appengine\lib\webapp2-2.5.2\webapp2.py", line 1529, in __call__
    rv = self.router.dispatch(request, response)
  File "C:\Program Files (x86)\Google\google_appengine\lib\webapp2-2.5.2\webapp2.py", line 1278, in default_dispatcher
    return route.handler_adapter(request, response)
  File "C:\Program Files (x86)\Google\google_appengine\lib\webapp2-2.5.2\webapp2.py", line 1102, in __call__
    return handler.dispatch()
  File "C:\Program Files (x86)\Google\google_appengine\lib\webapp2-2.5.2\webapp2.py", line 572, in dispatch
    return self.handle_exception(e, self.app.debug)
  File "C:\Program Files (x86)\Google\google_appengine\lib\webapp2-2.5.2\webapp2.py", line 570, in dispatch
    return method(*args, **kwargs)
  File "E:\WSE_NewsClusteriing\crawler\crawler.py", line 14, in get
    source_code = requests.get(url)
  File "libs\requests\api.py", line 67, in get
    return request('get', url, params=params, **kwargs)
  File "libs\requests\api.py", line 53, in request
    return session.request(method=method, url=url, **kwargs)
  File "libs\requests\sessions.py", line 468, in request
    resp = self.send(prep, **send_kwargs)
  File "libs\requests\sessions.py", line 576, in send
    r = adapter.send(request, **kwargs)
  File "libs\requests\adapters.py", line 376, in send
    timeout=timeout
  File "libs\requests\packages\urllib3\connectionpool.py", line 559, in urlopen
    body=body, headers=headers)
  File "libs\requests\packages\urllib3\connectionpool.py", line 390, in _make_request
    assert_header_parsing(httplib_response.msg)
  File "libs\requests\packages\urllib3\util\response.py", line 49, in assert_header_parsing
    type(headers)))
TypeError: expected httplib.Message, got <type 'instance'>.

My main.py is as follows:

import sys
sys.path.insert(0, 'libs')

import webapp2
import requests
from bs4 import BeautifulSoup

class MainPage(webapp2.RequestHandler):
    def get(self):
        self.response.headers['Content-Type'] = 'text/plain'
        url = 'http://www.bbc.com/news/world'
        source_code = requests.get(url)
        plain_text = source_code.text
        soup = BeautifulSoup(plain_text)
        for link in soup.findAll('a', {'class': 'title-link'}):
            href = 'http://www.bbc.com' + link.get('href')
            self.response.write(href)


app = webapp2.WSGIApplication([
    ('/', MainPage),
], debug=True)

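The catch is that the crawler works fine as a standalone Python application.

Can someone help me figure out what is wrong here? Does the requests module cause compatibility issues with GAE?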

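I would suggest not using the requests library on App Engine for the time being, since it is not officially supported and you are therefore quite likely to run into compatibility issues. According to the URL Fetch Python API article, the supported libraries are urllib, urllib2, httplib, and urlfetch used directly. Some features of the requests library may also be built on the urllib3 library, because the two projects collaborate. That library is not supported yet either.

For simple examples of urllib2 and urlfetch requests, feel free to consult the URL Fetch article. If these libraries do not suit you in some way, feel free to point that out in your question.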
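As a rough illustration of the "use urlfetch directly" suggestion, here is a minimal sketch. The helper name `fetch_html`, its `fetch` parameter, and the 10-second `deadline` are my own assumptions for illustration and do not come from the question or the docs; only `urlfetch.fetch` and the `status_code` and `content` attributes of its result are part of the App Engine API.

```python
def fetch_html(url, fetch=None, deadline=10):
    """Return the body of `url` as bytes, or None if the status is not 200.

    `fetch` is a hypothetical injection point so the helper can be tried
    outside App Engine; by default it uses urlfetch.fetch.
    """
    if fetch is None:
        # Imported lazily: this module only exists inside the App Engine
        # Python runtime (or the local development server).
        from google.appengine.api import urlfetch
        fetch = urlfetch.fetch

    # deadline is the maximum number of seconds to wait for the response.
    result = fetch(url, deadline=deadline)
    if result.status_code != 200:
        return None
    # content is the raw response body; decode it if you need text.
    return result.content
```

In the handler from the question, something like `plain_text = fetch_html(url)` could then stand in for `requests.get(url).text`, with the caveat that it returns raw bytes and may return None.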

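This question is almost two years old, but I just stumbled upon it. For the benefit of anyone who runs into a similar problem, the docs describe how to issue HTTP(S) requests: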

import requests
import requests_toolbelt.adapters.appengine

# Use the App Engine Requests adapter. This makes sure that Requests uses
# URLFetch.
requests_toolbelt.adapters.appengine.monkeypatch()

Reference: https://cloud.google.com/appengine/docs/standard/python/issue-requests
