
URL UTF-8 Decoding Python

I have some URL-encoded data that I want to decode using Python. I tried the (accepted) answer here, but I am still not getting the correct decoding. My code is as follows:

import urllib2

name = '%D0%BD%D0%BE%D1%82%D0%B8%D1%84%D0%B8%D0%BA%D0%B0%D1%82%D0%BE%D1%80-%D0%BE%D0%BB%D0%B8%D0%BC%D0%BF%D0%B8%D0%B9%D1%81%D0%BA%D0%B8%D1%85-%D0%B8'

print urllib2.unquote(urllib2.quote(name.encode("utf8"))).decode("utf8")

This should print нотификатор-олимпийских-и, but it prints %D0%BD%D0%BE%D1%82%D0%B8%D1%84%D0%B8%D0%BA%D0%B0%D1%82%D0%BE%D1%80-%D0%BE%D0%BB%D0%B8%D0%BC%D0%BF%D0%B8%D0%B9%D1%81%D0%BA%D0%B8%D1%85-%D0%B8

So I tried unquoting it again:

print urllib2.unquote(urllib2.unquote(urllib2.quote(name.encode("utf8"))).decode("utf8"))

but it gives me ноÑиÑикаÑоÑ-олимпийÑкиÑ-и

I am not sure why this happens. Can anyone please explain what I am doing wrong and how to correct my mistake?

Too many quote/unquote operations: you have a UTF-8 string that is already URL-encoded, so why are you UTF-8 encoding and URL-encoding it again? Unquote it once, then decode the resulting bytes as UTF-8:

import urllib

# Python 2: unquote once to get the raw UTF-8 bytes, then decode them to unicode
unquoted = urllib.unquote(name)
print unquoted.decode('utf-8')
# нотификатор-олимпийских-и
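
The snippets above are Python 2. As a minimal sketch for Python 3 (an assumption, since the question does not mention it), the same decoding can be done with `urllib.parse.unquote`, which interprets the percent-escapes as UTF-8 by default. The variable names `decoded`, `round_trip`, and `mojibake` are made up for illustration. The last line shows one likely way to end up with garbled text: treating the unquoted bytes as Latin-1 rather than UTF-8.

```python
from urllib.parse import quote, unquote

# The URL-encoded value from the question, split across lines for readability.
name = ('%D0%BD%D0%BE%D1%82%D0%B8%D1%84%D0%B8%D0%BA%D0%B0%D1%82%D0%BE%D1%80'
        '-%D0%BE%D0%BB%D0%B8%D0%BC%D0%BF%D0%B8%D0%B9%D1%81%D0%BA%D0%B8%D1%85'
        '-%D0%B8')

# A single unquote is enough: the escapes are decoded as UTF-8 by default.
decoded = unquote(name)
print(decoded)

# Quoting first escapes every '%' as '%25', so one unquote only undoes the
# extra quote and hands back the original, still-encoded string.
round_trip = unquote(quote(name))
print(round_trip == name)

# Decoding the same bytes as Latin-1 instead of UTF-8 produces garbled text.
mojibake = unquote(name, encoding='latin-1')
print(mojibake == decoded)
```

If the data arrives as a full query string rather than a single value, `urllib.parse.parse_qs` applies the same unquoting to each field.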


 