Python Requests Invalid URL Label error

I'm trying to access Shopify's API, which uses URLs of the form https://apikey:password@hostname/admin/resource.xml

e.g. http://7ea7a2ff231f9f7:95c5e8091839609c864@iliketurtles.myshopify.com/admin/orders.xml

Running $ curl api_url downloads the correct XML. However, when I do

 import requests
 api_url = 'http://7ea7a2ff231f9f7d:95c5e8091839609c864@iliketurtles.myshopify.com/admin/orders.xml'
 r = requests.get(api_url) # Invalid url label error

Any idea why I'm getting this? curl and opening the link directly in the browser both work fine. Is it because the URL is too long?

Thanks!

It's not the length of the URL. If I do:

import requests
test_url = 'http://www.google.com/?somereallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallyreallylongurl=true'
r = requests.get(test_url)

it returns <Response [200]>.

Have you tried making the request with the authentication parameters that requests provides, i.e. passing the credentials through the auth argument instead of embedding them in the URL?

>>> requests.get('http://iliketurtles.myshopify.com/admin/orders.xml', auth=('ea7a2ff231f9f7', '95c5e8091839609c864'))
<Response [403]>
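
If the credentials only exist inside a URL string, they can be pulled out before the request is made. The following is a minimal sketch using only the standard library; split_credentials is a hypothetical helper name, not part of requests, and it assumes a plain hostname (no IPv6 literal).

```python
from urllib.parse import urlsplit, urlunsplit


def split_credentials(url):
    """Return (url_without_userinfo, (username, password)).

    Hypothetical helper for illustration; assumes a non-IPv6 hostname.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    if parts.port is not None:
        # Keep an explicit port if the original URL had one.
        host = f"{host}:{parts.port}"
    clean = urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))
    return clean, (parts.username, parts.password)


url = "http://7ea7a2ff231f9f7d:95c5e8091839609c864@iliketurtles.myshopify.com/admin/orders.xml"
clean_url, auth = split_credentials(url)
# clean_url and auth could then be used as: requests.get(clean_url, auth=auth)
```

This keeps the userinfo out of the hostname that the library has to parse and encode.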

The error ('URL has an invalid label.') is probably a bug in the requests library: it applies IDNA encoding (meant for internationalized domain names) to the hostname with the userinfo still attached. From the source:

netloc = netloc.encode('idna').decode('utf-8')

That might raise a 'label empty or too long' error for a long username:password, because everything before the first dot is treated as a single hostname label. You can try reporting it on the requests issue tracker.
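
The length limit can be reproduced with Python's built-in idna codec alone, without requests. In this sketch the 32-character key and password are made-up placeholders (an assumption about typical key length, not real credentials), and idna_encodable is a hypothetical helper name.

```python
def idna_encodable(netloc):
    """Return True if Python's built-in idna codec accepts this netloc."""
    try:
        netloc.encode("idna")
        return True
    except UnicodeError:
        # Raised when a dot-separated label is empty or longer than 63 characters.
        return False


host = "iliketurtles.myshopify.com"
# Placeholder credentials, 32 characters each (assumed length for illustration).
userinfo = "a" * 32 + ":" + "b" * 32

host_alone_ok = idna_encodable(host)
with_userinfo_ok = idna_encodable(userinfo + "@" + host)
```

With the userinfo attached, the first label becomes "key:password@iliketurtles", which exceeds the 63-character limit on a DNS label, so the encoding step fails even though the hostname itself is valid.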

The a:b@example.com form is deprecated. Otherwise, requests.get('https://a:b@example.com') should be equivalent to requests.get('https://example.com', auth=('a', 'b')) as long as all characters in username:password come from the set [-A-Za-z0-9._~!$&'()*+,;=].

curl and requests also differ when there are percent-encoded characters in the userinfo. For example, https://a:%C3%80@example.com leads to curl generating the following HTTP header:

Authorization: Basic YTrDgA==

but requests produces:

Authorization: Basic YTolQzMlODA=

i.e.:

>>> import base64
>>> base64.b64decode('YTrDgA==')
'a:\xc3\x80'
>>> print _
a:À
>>> base64.b64decode('YTolQzMlODA=')
'a:%C3%80'
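
The two header values can be rebuilt in Python 3 to show where the difference comes from: whether the password is percent-decoded before the Basic credentials are base64-encoded. This is a sketch of that arithmetic only, not the actual code of curl or requests; basic_auth_value is a hypothetical helper name.

```python
import base64
from urllib.parse import unquote


def basic_auth_value(username, password):
    """Build the value of an HTTP Basic Authorization header (UTF-8 encoded)."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


raw_password = "%C3%80"

# Percent-decoding first gives the header that curl was observed to send.
decoded_first = basic_auth_value("a", unquote(raw_password))

# Using the password verbatim gives the header that requests was observed to send.
verbatim = basic_auth_value("a", raw_password)
```

A server that expects the decoded password will reject the second header, so credentials containing percent-encoded characters are safer passed through auth= in their decoded form.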
