Python: what's the difference between str(u'a') and u'a'.encode('utf-8')?

As the title says: is there a reason not to use str() to convert a unicode string to str?

>>> str(u'a')
'a'
>>> str(u'a').__class__
<type 'str'>
>>> u'a'.encode('utf-8')
'a'
>>> u'a'.encode('utf-8').__class__
<type 'str'>
>>> u'a'.encode().__class__
<type 'str'>

UPDATE: Thanks for the answer. I also didn't know that if I create a string using a special character, it is automatically stored as UTF-8:

>>> a = '€'
>>> a.__class__
<type 'str'>
>>> a
'\xe2\x82\xac'

Also, the same literal is a Unicode object in Python 3.
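
A hedged aside on the update above: in Python 2, a plain '€' literal typed at the prompt holds whatever bytes the terminal sends, which happen to be UTF-8 here, so nothing is converted automatically. The sketch below assumes a UTF-8 terminal, as in the question, and decodes those same bytes back into the single Unicode character. It is written to run under both Python 2 and Python 3.

```python
# The three bytes shown in the REPL output above (UTF-8 for the euro sign).
raw = b'\xe2\x82\xac'

# Decoding with the matching codec yields one Unicode character, U+20AC.
decoded = raw.decode('utf-8')

print(decoded == u'\u20ac')
print(len(raw))      # number of bytes
print(len(decoded))  # number of characters
```

If the terminal used a different encoding, the bytes in `raw` would differ, and decoding them as UTF-8 could fail or produce a different character.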

When you write str(u'a'), Python 2 converts the Unicode string to a bytestring using the default encoding, which (unless you've gone to the trouble of changing it) is ASCII.

The second version explicitly encodes the string as UTF-8.
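
To make the equivalence concrete, here is a minimal sketch that assumes the default encoding has been left as ASCII. Under that assumption, Python 2's str(u'a') behaves like an explicit .encode('ascii'). For pure-ASCII text both codecs produce identical bytes, which is why the outputs in the question look the same. The variable names are illustrative only, and the snippet should run on both Python 2 and Python 3.

```python
text = u'a'

# Roughly what Python 2's str(text) does when the default encoding is ASCII.
ascii_bytes = text.encode('ascii')

# The explicit version from the question.
utf8_bytes = text.encode('utf-8')

# ASCII is a subset of UTF-8, so the results match for ASCII-only input.
print(ascii_bytes == utf8_bytes)
```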

The difference is more apparent if you try it with a string containing non-ASCII characters. The second version will still work:

>>> u'€'.encode('utf-8')
'\xe2\x82\xac'

The first version raises an exception:

>>> str(u'€')

Traceback (most recent call last):
  File "", line 1, in 
    str(u'€')
UnicodeEncodeError: 'ascii' codec can't encode character u'\u20ac' in position 0: ordinal not in range(128)
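
The same contrast can be reproduced without depending on the terminal's encoding by writing the euro sign as an escape. This is a sketch rather than a definitive recipe: it calls .encode('ascii') explicitly to stand in for what Python 2's str() does when the default encoding is ASCII, and it should run on both Python 2 and Python 3.

```python
# EURO SIGN written as an escape, so the source or terminal encoding does not matter.
euro = u'\u20ac'

# Explicit UTF-8 encoding succeeds and yields three bytes.
utf8_bytes = euro.encode('utf-8')
print(repr(utf8_bytes))

# Encoding with the ASCII codec fails, because U+20AC is outside range(128).
try:
    euro.encode('ascii')
    ascii_failed = False
except UnicodeEncodeError as exc:
    ascii_failed = True
    print(exc)
```

Naming the codec explicitly keeps the result independent of the interpreter's default encoding.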
