Cannot convert ASCII to UTF-8 in Python
I have the Polish word "wąż", which means "snake",
but I get it from a web service in ASCII, so:
snake_in_polish_in_ascii="w\xc4\x85\xc5\xbc"
These are the results of my attempts:
print str(snake_in_polish_in_ascii)  # this prints w─ů┼╝
snake_in_polish_in_ascii.decode('utf-8')
print str(snake_in_polish_in_ascii)  # this prints w─ů┼╝ too
And this code:
print str(snake_in_polish_in_ascii.encode('utf-8'))
raises an exception:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc4 in position 1: ordinal not in range(128)
I'm using Wing IDE on Windows XP with the Polish locale.
At the top of the file I have:
# -*- coding: utf-8 -*-
I can't find a way to resolve it. Why can't I get "wąż" in the output?
This expression:
snake_in_polish_in_ascii.decode('utf-8')
doesn't change the string in place; it returns a new, decoded string. Try this instead:
print snake_in_polish_in_ascii.decode('utf-8')
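A minimal sketch of the point above: decode() returns a new object and leaves the original untouched, so the result has to be printed or assigned. It is written with a b"..." literal and print() so that it also runs on Python 3 (the question itself uses Python 2); the names raw and text are illustrative only.

```python
# -*- coding: utf-8 -*-
# UTF-8 encoded bytes, as received from the web service
raw = b"w\xc4\x85\xc5\xbc"

# The return value is discarded here, so nothing observable changes
raw.decode("utf-8")
assert raw == b"w\xc4\x85\xc5\xbc"

# Keep the decoded text by assigning it to a name
text = raw.decode("utf-8")

# 5 bytes decode to 3 characters: w, ą, ż
print(len(raw), len(text))
```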
As for why you see w─ů┼╝ when you do print snake_in_polish_in_ascii: your terminal uses the cp852 encoding (Central and Eastern Europe), so the UTF-8 bytes are displayed as cp852 characters. Try this to see:
>>> print snake_in_polish_in_ascii.decode("cp852")
w─ů┼╝
>>> i="w\xc4\x85\xc5\xbc"
>>> print i.decode('utf-8')
wąż
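The same comparison can be sketched without relying on what the console is able to display, by looking at code points instead of printed glyphs. This assumes, as above, that the bytes are UTF-8; the names as_cp852 and as_utf8 are illustrative.

```python
# -*- coding: utf-8 -*-
raw = b"w\xc4\x85\xc5\xbc"

# How a cp852 console interprets the bytes: one character per byte
as_cp852 = raw.decode("cp852")

# The intended interpretation: two-byte UTF-8 sequences for ą and ż
as_utf8 = raw.decode("utf-8")

# Show code points rather than glyphs, so the output does not depend
# on the encoding of the terminal running this script
print([hex(ord(ch)) for ch in as_cp852])
print([hex(ord(ch)) for ch in as_utf8])
```

The first list has five entries (box-drawing characters among them), the second has three, which is the "wąż" the question expects.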
Example:
snake_in_polish_in_ascii = 'w\xc4\x85\xc5\xbc'
print snake_in_polish_in_ascii.decode('cp1252').encode('utf-8')
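Before relying on this example, it may be worth checking what a cp1252 round trip does to these particular bytes. The sketch below assumes the input is already UTF-8, as the other answer indicates; the names via_cp1252 and round_trip are illustrative.

```python
# -*- coding: utf-8 -*-
raw = b"w\xc4\x85\xc5\xbc"

# Decoding as cp1252 maps every byte to its own character
via_cp1252 = raw.decode("cp1252")

# Re-encoding those characters as UTF-8 yields different, longer bytes
round_trip = via_cp1252.encode("utf-8")

print(len(via_cp1252))      # number of characters after the cp1252 decode
print(round_trip == raw)    # whether the round trip preserved the bytes
```

If the bytes really are UTF-8, the round trip does not return the original data, so decoding with 'utf-8' directly is the safer choice for this input.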
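By default, Python source files are treated as encoded in UTF-8, although the Python standard library uses only ASCII. (This is the Python 3 default; Python 2 assumes ASCII unless the file carries a coding declaration like the one above.)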