Python: what's the difference between str(u'a') and u'a'.encode('utf-8')?
As the title says: is there any reason not to use str() to convert a unicode string to a str?
>>> str(u'a')
'a'
>>> str(u'a').__class__
<type 'str'>
>>> u'a'.encode('utf-8')
'a'
>>> u'a'.encode('utf-8').__class__
<type 'str'>
>>> u'a'.encode().__class__
<type 'str'>
UPDATE: Thanks for the answer. I also didn't know that if I create a string using a special character, it is automatically converted to UTF-8:
>>> a = '€'
>>> a.__class__
<type 'str'>
>>> a
'\xe2\x82\xac'
Also, it is a Unicode object in Python 3.
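To make the UPDATE concrete, here is a small sketch of what those three bytes are. It assumes the terminal was sending UTF-8, which is what the '\xe2\x82\xac' output suggests; on a terminal with a different encoding the bytes would differ. The names euro_bytes and euro_text are only for illustration. The snippet avoids typing the character directly, so it should run the same way on Python 2 and Python 3.

```python
# The three bytes from the REPL output, written as a byte string.
euro_bytes = b'\xe2\x82\xac'

# Decoding them as UTF-8 yields a single Unicode character, U+20AC (EURO SIGN).
euro_text = euro_bytes.decode('utf-8')

print(len(euro_bytes))            # number of bytes
print(len(euro_text))             # number of characters
print(euro_text == u'\u20ac')     # same character as the escape form
```

In Python 2 the plain '...' literal holds the bytes, while in Python 3 a plain literal is already the decoded text and bytes are a separate type.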
When you write str(u'a'), it converts the Unicode string to a bytestring using the default encoding, which (unless you've gone to the trouble of changing it) will be ASCII.
The second version explicitly encodes the string as UTF-8.
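As a rough sketch of why the two calls look identical for u'a': on Python 2, str() on a unicode object behaves much like an explicit encode with the default encoding. The snippet below spells that out with .encode('ascii') instead of calling str(), so it also runs on Python 3 (where str() does not encode at all). The names implicit and explicit are only for illustration.

```python
import sys

text = u'a'

# Python 2 normally reports 'ascii' here; Python 3 reports 'utf-8'.
print(sys.getdefaultencoding())

# Roughly what str(u'a') does on Python 2 with the default encoding unchanged.
implicit = text.encode('ascii')

# The explicit version from the question.
explicit = text.encode('utf-8')

# ASCII is a subset of UTF-8, so ASCII-only text gives identical bytes.
print(implicit == explicit)
```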
The difference is more apparent if you try a string containing non-ASCII characters. The second version will still work:
>>> u'€'.encode('utf-8')
'\xe2\x82\xac'
The first version will raise an exception:
>>> str(u'€')
Traceback (most recent call last):
  File "", line 1, in
    str(u'€')
UnicodeEncodeError: 'ascii' codec can't encode character u'\u20ac' in position 0: ordinal not in range(128)
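If you want to reproduce both outcomes in a script rather than at the prompt, something like the following should work. It is only a sketch: it writes the euro sign as the escape u'\u20ac' so the result does not depend on the terminal or source file encoding, and it uses an explicit .encode('ascii') to stand in for what str() does on Python 2 with the default encoding. The names utf8_bytes and ascii_failed are made up for the example.

```python
euro = u'\u20ac'  # EURO SIGN, written as an escape

# Explicit UTF-8 encoding works for any Unicode character.
utf8_bytes = euro.encode('utf-8')
print(repr(utf8_bytes))

# Encoding with ASCII fails, because U+20AC is outside the range 0-127.
try:
    euro.encode('ascii')
    ascii_failed = False
except UnicodeEncodeError as exc:
    ascii_failed = True
    print(exc)
```

So str() is only safe when you know the text is pure ASCII. Calling .encode() with a named encoding states the intent and keeps working for other characters.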