Python: decoding non-English text to use as a URL?
I have a variable such as title:
title = "révolution_essentielle"
I could encode and decode it like this for other purposes:
title1 = unicode(title, encoding = "utf-8")
But how do I preserve the non-English characters and use the title as part of a URL string to access that URL? For instance, I ideally want
https://mainurl.com/révolution_essentielle.html
by concatenating several strings including title, like this:
url = main_url + "/" + title + ".html"
Could anyone kindly show me how to do that? Thanks a bunch!
To summarize what we've talked about in the comments: there is a function for quoting URLs (replacing special characters with %-prefixed escape sequences).
For Python 2 (as used in this case), it's urllib.quote(), which can be used as follows:
urllib.quote("révolution_essentielle")
When our input is a unicode object with wide characters, we also need to encode it first, e.g.:
urllib.quote(u'hey_there_who_likes_lego_that\xe3\u019\xe2_\xe3_...'.encode('utf8'))
Beware, though: make sure the encoding you pick matches the one expected/understood by the machine on the other end.
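To illustrate why the chosen encoding matters, here is a minimal sketch showing that the same character gets different percent-escapes depending on the byte encoding. It is written in Python 3 syntax with urllib.parse.quote() (an assumption made so the snippet is runnable on current interpreters; the question itself uses Python 2), and the sample string is only illustrative.

```python
from urllib.parse import quote

text = "é"  # illustrative sample character, not taken from a real server

# UTF-8 is the default encoding used by quote() for str input.
utf8_form = quote(text)

# The same character escaped from its Latin-1 byte representation.
latin1_form = quote(text, encoding="latin-1")

print(utf8_form)    # %C3%A9
print(latin1_form)  # %E9
```

A server that expects UTF-8 paths will not find a resource requested with the Latin-1 form, and vice versa, so it is worth checking which one the target site uses.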
If we were talking Python 3, the equivalent function would be urllib.parse.quote():
urllib.parse.quote("révolution_essentielle")
It accepts str (unicode) parameters as well as already-encoded values in bytes objects.
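Putting this together with the concatenation from the question, a rough Python 3 sketch might look like the following. The main_url value is the placeholder host from the question, and it assumes the server expects UTF-8 percent-encoding.

```python
from urllib.parse import quote, unquote

main_url = "https://mainurl.com"  # placeholder host from the question
title = "révolution_essentielle"

# quote() encodes str input as UTF-8 by default and percent-escapes
# every byte that is not a safe ASCII character.
url = main_url + "/" + quote(title) + ".html"
print(url)  # https://mainurl.com/r%C3%A9volution_essentielle.html

# bytes input is escaped as-is, so pre-encoded UTF-8 gives the same result.
assert quote(title.encode("utf-8")) == quote(title)

# unquote() reverses the escaping and recovers the original title.
assert unquote(quote(title)) == title
```

Only the title is passed through quote() here; quoting the whole URL would also escape the colon after the scheme.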