簡體   English   中英

使用urllib2的Python回溯錯誤

[英]Python traceback error using urllib2

我真的很困惑,是Python的新手,我正在編寫一個腳本,該腳本抓取了一個使用Python27的產品的網站。 我正在嘗試使用urllib2來執行此操作,當我運行腳本時,它會打印多個回溯錯誤。 有什么建議嗎?

腳本:

import urllib2, zlib, json

url='https://launches.endclothing.com/api/products'
req = urllib2.Request(url)
req.add_header(':host','launches.endclothing.com');req.add_header(':method','GET');req.add_header(':path','/api/products');req.add_header(':scheme','https');req.add_header(':version','HTTP/1.1');req.add_header('accept','application/json, text/plain, */*');req.add_header('accept-encoding','gzip,deflate');req.add_header('accept-language','en-US,en;q=0.8');req.add_header('cache-control','max-age=0');req.add_header('cookie','__/');req.add_header('user-agent','Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/37.0.2062.120 Chrome/37.0.2062.120 Safari/537.36');
resp = urllib2.urlopen(req).read()
resp = zlib.decompress(bytes(bytearray(resp)),15+32)
data = json.loads(resp)
for product in data:
    for attrib in product.keys():
        print str(attrib)+' :: '+ str(product[attrib])
    print '\n'

錯誤:

C:\Users\Luke>py C:\Users\Luke\Documents\EndBot2.py
Traceback (most recent call last):
  File "C:\Users\Luke\Documents\EndBot2.py", line 5, in <module>
    resp = urllib2.urlopen(req).read()
  File "C:\Python27\lib\urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "C:\Python27\lib\urllib2.py", line 391, in open
    response = self._open(req, data)
  File "C:\Python27\lib\urllib2.py", line 409, in _open
    '_open', req)
  File "C:\Python27\lib\urllib2.py", line 369, in _call_chain
    result = func(*args)
  File "C:\Python27\lib\urllib2.py", line 1181, in https_open
    return self.do_open(httplib.HTTPSConnection, req)
  File "C:\Python27\lib\urllib2.py", line 1148, in do_open
    raise URLError(err)
urllib2.URLError: <urlopen error [Errno 1] _ssl.c:499: error:14077438:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert internal error>

您的請求的SSL配置遇到問題。 我很抱歉,但我不會糾正你的代碼,因為我們是在2016年,這里面的,你應該使用一個美好的庫: 請求

所以它的用法很簡單:

>>> user_agent = 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.1'
>>> result = requests.get('https://launches.endclothing.com/api/products', headers={'user-agent': user_agent})
>>> result
<Response [200]>
>>> result.json()
[{u'name': u'Adidas Consortium x HighSnobiety Ultraboost', u'colour': u'Grey', u'id': 30, u'releaseDate': u'2016-04-09T00:01:00+0100', …

您會注意到,我在上一個查詢中更改了用戶代理以使其正常運行,因為很奇怪,該網站拒絕對requests API訪問:

>>> result = requests.get('https://launches.endclothing.com/api/products')
>>> result
<Response [403]>
>>> result.text
This website is using a security service to protect itself from online attacks. The action you just performed triggered the security solution. There are several actions that could trigger this block including submitting a certain word or phrase, a SQL command or malformed data.</p></div><div class="error-right"><h3>What can I do to resolve this?</h3><p>If you are on a personal connection, like at home, you can run an anti-virus scan on your device to make sure it is not infected with malware.</p><p>If you are at an office or shared network, you can ask the network administrator to run a scan across the network looking for misconfigured or infected devices.

否則,既然您已經嘗試過requests並且生活已經改變,那么您仍然可能再次遇到此問題。 正如您可能在Internet上的許多地方讀到的那樣,這與SNI和過時的庫有關,嘗試弄清楚這一點可能會令人頭疼。 最好的建議是升級到Python3,因為可能會通過安裝新的原始版本的python和所涉及的libs解決此問題。

高溫超導

暫無
暫無

聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.

 
粵ICP備18138465號  © 2020-2024 STACKOOM.COM