Python traceback error using urllib2

I am really confused. I am new to Python and am working on a Python 2.7 script that scrapes a website for products. I am trying to use urllib2 to do this, but when I run the script it prints a traceback. Any suggestions?

Script:

import urllib2, zlib, json

url='https://launches.endclothing.com/api/products'
req = urllib2.Request(url)
req.add_header(':host','launches.endclothing.com');req.add_header(':method','GET');req.add_header(':path','/api/products');req.add_header(':scheme','https');req.add_header(':version','HTTP/1.1');req.add_header('accept','application/json, text/plain, */*');req.add_header('accept-encoding','gzip,deflate');req.add_header('accept-language','en-US,en;q=0.8');req.add_header('cache-control','max-age=0');req.add_header('cookie','__/');req.add_header('user-agent','Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/37.0.2062.120 Chrome/37.0.2062.120 Safari/537.36');
resp = urllib2.urlopen(req).read()
resp = zlib.decompress(bytes(bytearray(resp)),15+32)
data = json.loads(resp)
for product in data:
    for attrib in product.keys():
        print str(attrib)+' :: '+ str(product[attrib])
    print '\n'

Error(s):

C:\Users\Luke>py C:\Users\Luke\Documents\EndBot2.py
Traceback (most recent call last):
  File "C:\Users\Luke\Documents\EndBot2.py", line 5, in <module>
    resp = urllib2.urlopen(req).read()
  File "C:\Python27\lib\urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "C:\Python27\lib\urllib2.py", line 391, in open
    response = self._open(req, data)
  File "C:\Python27\lib\urllib2.py", line 409, in _open
    '_open', req)
  File "C:\Python27\lib\urllib2.py", line 369, in _call_chain
    result = func(*args)
  File "C:\Python27\lib\urllib2.py", line 1181, in https_open
    return self.do_open(httplib.HTTPSConnection, req)
  File "C:\Python27\lib\urllib2.py", line 1148, in do_open
    raise URLError(err)
urllib2.URLError: <urlopen error [Errno 1] _ssl.c:499: error:14077438:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert internal error>

You're running into a problem with the SSL configuration of your request: the traceback shows the handshake failing with a tlsv1 alert internal error. I'm sorry, but I won't correct your code, because it's 2016 and there's a wonderful library that you should use instead: requests.

Its usage is pretty simple:

>>> import requests
>>> user_agent = 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.1'
>>> result = requests.get('https://launches.endclothing.com/api/products', headers={'user-agent': user_agent})
>>> result
<Response [200]>
>>> result.json()
[{u'name': u'Adidas Consortium x HighSnobiety Ultraboost', u'colour': u'Grey', u'id': 30, u'releaseDate': u'2016-04-09T00:01:00+0100', …
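For reference, here is a minimal sketch of how the original script's loop might look on top of requests. It assumes Python 3 with requests installed and that the endpoint still returns a JSON list of objects. The helper names format_product and fetch_products are made up for this example. The colon-prefixed headers from the question (:host, :method and so on) are left out, since they look like SPDY/HTTP2 pseudo-headers copied from the browser's dev tools rather than regular request headers. The manual zlib step is also gone because requests decodes gzip responses itself.

```python
def format_product(product):
    """Return one 'key :: value' line per attribute, like the original loop."""
    return ["{} :: {}".format(key, value) for key, value in product.items()]


def fetch_products(url, user_agent):
    """Fetch and decode the JSON product list (needs network access)."""
    # Third-party dependency, imported here so that format_product
    # stays usable even where requests is not installed.
    import requests

    response = requests.get(url, headers={"user-agent": user_agent}, timeout=10)
    # Raise on an error status such as 403 instead of failing later in .json().
    response.raise_for_status()
    return response.json()


def print_products(url, user_agent):
    """Print every product's attributes, separated by a blank line."""
    for product in fetch_products(url, user_agent):
        print("\n".join(format_product(product)))
        print()
```

Calling print_products('https://launches.endclothing.com/api/products', user_agent) with the user_agent string from the session above should then print the same kind of listing the original script was after, provided the site still responds the way it did in the session.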

You'll notice that I changed the user-agent in the previous query to get it working because, weirdly enough, the website refuses API access to the default requests user-agent:

>>> result = requests.get('https://launches.endclothing.com/api/products')
>>> result
<Response [403]>
>>> result.text
This website is using a security service to protect itself from online attacks. The action you just performed triggered the security solution. There are several actions that could trigger this block including submitting a certain word or phrase, a SQL command or malformed data.</p></div><div class="error-right"><h3>What can I do to resolve this?</h3><p>If you are on a personal connection, like at home, you can run an anti-virus scan on your device to make sure it is not infected with malware.</p><p>If you are at an office or shared network, you can ask the network administrator to run a scan across the network looking for misconfigured or infected devices.

Otherwise, now that you've tried requests and your life has changed, you might still run into this issue again. As you can read in many places on the internet, it is related to SNI and outdated libraries, and figuring it out can be a headache. My best advice is to upgrade to Python 3, as installing a fresh vanilla version of Python and the libraries involved is likely to solve the problem.
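If you want to check whether your interpreter is one of the outdated setups, a quick sketch like the one below may help. It only reads attributes of the standard ssl and sys modules. The function name ssl_report is made up for this example. As far as I know, the Python 2 standard library only gained SNI support in 2.7.9, so an older 2.7 build would be a plausible suspect here, though this report alone does not prove that SNI is the cause.

```python
import ssl
import sys


def ssl_report():
    """Collect the interpreter and OpenSSL details relevant to SNI support."""
    return {
        "python": "{}.{}.{}".format(*sys.version_info[:3]),
        "openssl": ssl.OPENSSL_VERSION,
        # HAS_SNI is missing on old interpreters, so default to False.
        "has_sni": getattr(ssl, "HAS_SNI", False),
    }


for name, value in sorted(ssl_report().items()):
    print("{}: {}".format(name, value))
```

If has_sni comes back False, or the OpenSSL version is very old, upgrading the interpreter is probably the simplest fix.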

HTH
