Python requests: invalid URL label error

Question

I'm trying to access Shopify's API, which uses URLs of the form https://apikey:password@hostname/admin/resource.xml

For example: http://7ea7a2ff231f9f7:95c5e8091839609c864@iliketurtles.myshopify.com/admin/orders.xml

Running $ curl api_url downloads the correct XML, but when I do

 import requests
 api_url = 'http://7ea7a2ff231f9f7d:95c5e8091839609c864@iliketurtles.myshopify.com/admin/orders.xml'
 r = requests.get(api_url) # Invalid url label error

Any idea why I'm getting this? Both curl and opening the link directly in the browser work fine. Is it because the URL is too long?

Thanks!

Answer 1

It's not the length of the URL. If I do:

import requests
test_url = 'http://www.google.com/?somereallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallylongurl=true'
r = requests.get(test_url)

it returns <Response [200]>.

Have you tried making the request with requests' authentication parameters (the auth argument)?

>>> requests.get('http://iliketurtles.myshopify.com/admin/orders.xml', auth=('ea7a2ff231f9f7', '95c5e8091839609c864'))
<Response [403]>
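If the credentials only exist inside the URL, they can be pulled out before calling requests. The following is a minimal sketch using only the standard library; the helper name split_credentials is made up for illustration, and it does not handle IPv6 hosts or percent-encoded credentials.

```python
from urllib.parse import urlsplit, urlunsplit

def split_credentials(url):
    """Return (url_without_userinfo, (username, password))."""
    parts = urlsplit(url)
    # Rebuild the network location from the host (and port, if any) only.
    host = parts.hostname or ""
    netloc = host if parts.port is None else f"{host}:{parts.port}"
    clean_url = urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )
    return clean_url, (parts.username, parts.password)

url, auth = split_credentials(
    "http://7ea7a2ff231f9f7d:95c5e8091839609c864"
    "@iliketurtles.myshopify.com/admin/orders.xml"
)
print(url)   # http://iliketurtles.myshopify.com/admin/orders.xml
print(auth)  # ('7ea7a2ff231f9f7d', '95c5e8091839609c864')
```

The resulting pair can then be passed along as requests.get(url, auth=auth).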

Answer 2

The error ('URL has an invalid label.') is probably a bug in the requests library: it applies IDNA encoding (for internationalized domain names) to the hostname with the userinfo still attached. Source:

netloc = netloc.encode('idna').decode('utf-8')

That might raise a 'label empty or too long' error for a long username:password. You can try reporting it on the requests issue tracker.
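The label-length failure can be reproduced with Python's built-in idna codec alone, without requests. This is only a sketch of the suspected mechanism: the 32-character key and password below are placeholder values (an assumption, not the real credentials), chosen so that the first dot-separated label exceeds the 63-character DNS label limit.

```python
def idna_accepts(netloc):
    """Return True if the built-in idna codec can encode netloc."""
    try:
        netloc.encode("idna")
    except UnicodeError:
        # Raised when a dot-separated label is empty or longer than 63 characters.
        return False
    return True

host = "iliketurtles.myshopify.com"
api_key = "a" * 32    # placeholder value, for illustration only
password = "b" * 32   # placeholder value, for illustration only

print(idna_accepts(host))                            # True
print(idna_accepts(f"{api_key}:{password}@{host}"))  # False
```

With the userinfo attached, everything before the first dot counts as one label, so long credentials push it over the limit even though the hostname itself is fine.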

The a:b@example.com form is deprecated. Otherwise, requests.get('https://a:b@example.com') should be equivalent to requests.get('https://example.com', auth=('a', 'b')), provided all characters in username:password come from the set [-A-Za-z0-9._~!$&'()*+,;=].
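To check whether a given username and password fall inside that character set (the RFC 3986 unreserved and sub-delims characters), something like the following could be used. The function name is_plain_userinfo is hypothetical.

```python
import re

# The set quoted above: RFC 3986 "unreserved" plus "sub-delims" characters.
SAFE_USERINFO = re.compile(r"[-A-Za-z0-9._~!$&'()*+,;=]*")

def is_plain_userinfo(username, password):
    """Return True if both parts use only characters from the safe set."""
    return all(
        SAFE_USERINFO.fullmatch(part) is not None
        for part in (username, password)
    )

print(is_plain_userinfo("a", "b"))        # True
print(is_plain_userinfo("a", "%C3%80"))   # False: '%' is outside the set
```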

curl and requests also differ when there are percent-encoded characters in the userinfo. For example, https://a:%C3%80@example.com leads curl to generate the following HTTP header:

Authorization: Basic YTrDgA==

but requests produces:

Authorization: Basic YTolQzMlODA=

i.e.:

>>> import base64
>>> base64.b64decode('YTrDgA==')
'a:\xc3\x80'
>>> print _
a:À
>>> base64.b64decode('YTolQzMlODA=')
'a:%C3%80'
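The two header values can be rebuilt in Python 3 to show where they diverge. This sketch assumes the only difference is whether the userinfo is percent-decoded before Base64 encoding, as the decoded output above suggests; basic_header is an illustrative helper, not part of curl or requests.

```python
import base64
from urllib.parse import unquote_to_bytes

def basic_header(username, password, decode_percent):
    """Build a Basic Authorization value, optionally percent-decoding first."""
    raw = f"{username}:{password}".encode("utf-8")
    if decode_percent:
        # Turn %C3%80 into the two raw bytes 0xC3 0x80 (UTF-8 for 'À').
        raw = unquote_to_bytes(raw)
    return "Basic " + base64.b64encode(raw).decode("ascii")

print(basic_header("a", "%C3%80", decode_percent=True))   # Basic YTrDgA==
print(basic_header("a", "%C3%80", decode_percent=False))  # Basic YTolQzMlODA=
```

The first value matches the header attributed to curl, the second the one attributed to requests.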

Answer 1
3 2013-05-09 20:29:59

Answer 2
3 Accepted 2013-05-10 00:29:24
