Specific string in Python requests.get gives ValueError

Trying to get this specific page...

import requests
request = requests.get('http://market.yandex.ru/catalog/90555/list')

...gives me a strange error:

ValueError                                Traceback (most recent call last)
    C:\Python34\lib\site-packages\requests\packages\urllib3\response.py in read_chunked(self, amt)
    406                 try:
--> 407                     self.chunk_left = int(line, 16)
    408                 except ValueError:

ValueError: invalid literal for int() with base 16: ''

I figured out that some part of the URL string is to blame. I experimented with it, and the results are even weirder:

# No error
http://market.ru/catalog/90555/list
http://market.yandex.ru/catalo

# Error
http://market.yandex.ru/catalog

P.S. The problem first occurred today. Until just recently I had no problems getting this very page using the same method.

You are being rate limited, but the server does so in a way that violates the HTTP specification. Its response headers promise a chunked transfer encoding, but the server then closes the connection without sending a chunked body.
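To see why this surfaces as a ValueError: in a chunked body, every chunk is preceded by a line holding its size in hexadecimal. The following is a minimal sketch of such a decoder. The read_chunked helper here is a hypothetical, simplified illustration, not the actual urllib3 code. When the server sends no chunks at all, the size line is empty and int('', 16) fails just like in the traceback.

```python
import io


def read_chunked(stream):
    """Decode a chunked HTTP body from a binary file-like object.

    Simplified illustration: chunk extensions and trailers are ignored.
    """
    body = b""
    while True:
        # Each chunk starts with its size in hex, terminated by CRLF.
        size_line = stream.readline().strip().decode("ascii")
        chunk_size = int(size_line, 16)  # ValueError if the line is empty
        if chunk_size == 0:
            break  # a zero-sized chunk marks the end of the body
        body += stream.read(chunk_size)
        stream.readline()  # consume the CRLF that follows the chunk data
    return body


# A well-formed chunked body decodes fine.
good = io.BytesIO(b"5\r\nhello\r\n6\r\n world\r\n0\r\n\r\n")
print(read_chunked(good))

# What the server effectively sent here: headers, then nothing at all.
try:
    read_chunked(io.BytesIO(b""))
except ValueError as exc:
    print("ValueError:", exc)
```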

If you look at the URL with curl in verbose mode, you get the following output:

$ curl -v https://market.yandex.ru/catalog/90555/list
* Hostname was NOT found in DNS cache
*   Trying 213.180.204.22...
* Connected to market.yandex.ru (213.180.204.22) port 443 (#0)
* TLS 1.2 connection using TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256
* Server certificate: market.yandex.ru
* Server certificate: Certum Level IV CA
* Server certificate: Certum CA
> GET /catalog/90555/list HTTP/1.1
> User-Agent: curl/7.37.1
> Host: market.yandex.ru
> Accept: */*
> 
< HTTP/1.1 302 Found
* Server nginx is not blacklisted
< Server: nginx
< Date: Mon, 18 May 2015 18:53:15 GMT
< Content-Type: text/html; charset=UTF-8
< Transfer-Encoding: chunked
< Connection: keep-alive
< Keep-Alive: timeout=120
< X-Forwardtouser-Y: 1
< Set-Cookie: spravka=dD0xNDAwNDM5MTk1O2k9ODQuOTIuOTguMTcwO3U9MTQwMDQzOTE5NTUxNjUwOTExMjtoPWNkMzVlMzBlMjgxMTg4YWM0YjYyZDg3OTg4ZjUyNWFj; domain=.yandex.ru; path=/; expires=Wed, 17-Jun-2015 18:53:15 GMT
< Location: http://market.yandex.ru/showcaptcha?cc=1&retpath=http%3A//market.yandex.ru/catalog/90555/list%3F_bfd13d35fbf1551a835f050d3775fc4b&t=0/1431975195/029660aeb063916c78e30ebd9444fd4b&s=4dd645e7048b399008278208fa776ba9
< Set-Cookie: uid=CniLolVaNRthdR2JDtV0Ag==; path=/
< 
* transfer closed with outstanding read data remaining
* Closing connection 0
curl: (18) transfer closed with outstanding read data remaining

They are sending you a redirect, but the Transfer-Encoding: chunked header in the response means that the client has to read chunks, and those chunks never arrive.
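The status line and headers still arrive intact, so they can be inspected even though the body cannot be read. Below is a rough, self-contained sketch of that idea using only the standard library. It starts a throwaway local server that imitates the broken response (the serve_once and probe helpers and the example.invalid Location value are made up for illustration), then reads the headers before attempting the body.

```python
import http.client
import socket
import threading

# Headers that promise a chunked body; the connection is then simply closed.
BROKEN_RESPONSE = (
    b"HTTP/1.1 302 Found\r\n"
    b"Transfer-Encoding: chunked\r\n"
    b"Location: http://example.invalid/showcaptcha\r\n"
    b"\r\n"
)


def serve_once(listener):
    """Accept one connection, send headers without a body, then hang up."""
    conn, _ = listener.accept()
    with conn:
        conn.recv(65536)  # read (and ignore) the request
        conn.sendall(BROKEN_RESPONSE)


def probe(host, port, path="/"):
    """Return (status, Location header, whether the body could be read)."""
    client = http.client.HTTPConnection(host, port, timeout=5)
    try:
        client.request("GET", path)
        response = client.getresponse()  # parses status line and headers only
        status = response.status
        location = response.getheader("Location")
        try:
            response.read()
            body_ok = True
        except http.client.IncompleteRead:
            # The promised chunks never arrived.
            body_ok = False
        return status, location, body_ok
    finally:
        client.close()


listener = socket.socket()
listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
listener.listen(1)
threading.Thread(target=serve_once, args=(listener,), daemon=True).start()

result = probe("127.0.0.1", listener.getsockname()[1])
listener.close()
print(result)
```

With requests, passing stream=True and allow_redirects=False to requests.get defers the body download and stops the redirect from being followed, so response.status_code and response.headers can be checked in a similar way. Whether that avoids the error against this particular server is an assumption.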

The redirect leads to a captcha:

http://market.yandex.ru/showcaptcha?cc=1&retpath=http%3A//market.yandex.ru/catalog/90555/list%3F_bfd13d35fbf1551a835f050d3775fc4b&t=0/1431975195/029660aeb063916c78e30ebd9444fd4b&s=4dd645e7048b399008278208fa776ba9
#                       ^^^^^^^^^^^
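A scraper could check for this case before trusting a response. The sketch below is only an illustration: captcha_retpath is a hypothetical helper, and it assumes the /showcaptcha path and the retpath query parameter always look like they do in the one redirect shown above.

```python
from urllib.parse import parse_qs, urlsplit


def captcha_retpath(location):
    """Return the originally requested URL if `location` is a captcha
    redirect of the form seen above, otherwise None."""
    parts = urlsplit(location)
    if parts.path != "/showcaptcha":
        return None
    # parse_qs percent-decodes values; retpath holds the page we asked for.
    values = parse_qs(parts.query).get("retpath")
    return values[0] if values else None


location = (
    "http://market.yandex.ru/showcaptcha?cc=1"
    "&retpath=http%3A//market.yandex.ru/catalog/90555/list"
    "%3F_bfd13d35fbf1551a835f050d3775fc4b"
    "&t=0/1431975195/029660aeb063916c78e30ebd9444fd4b"
    "&s=4dd645e7048b399008278208fa776ba9"
)
print(captcha_retpath(location))
```

A non-None result means the request was rate limited, so the client should slow down rather than retry immediately.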
