Python unicode string literals: what is the difference between '\u0391' and u'\u0391'?

Question

I am using Python 2.7.3. Can anybody explain the difference between the literals:

'\u0391'

and:

u'\u0391'

and the different way they are echoed in the REPL below (especially the extra slash added to a1):

>>> a1='\u0391'
>>> a1
'\\u0391'
>>> type(a1)
<type 'str'>
>>> 
>>> a2=u'\u0391'
>>> a2
u'\u0391'
>>> type(a2)
<type 'unicode'>
>>>

Answer 1

You can only use unicode escapes (\uabcd) in a unicode string literal. They have no meaning in a byte string. A Python 2 Unicode literal (u'some text') is a different type of Python object from a Python byte string ('some text').

It's like using \t versus \T; the former has meaning in Python literals (it's interpreted as a tab character), the latter just means a backslash and a capital T (two characters).

To help understand the difference between Unicode and byte strings, please do read the Python Unicode HOWTO; I can also recommend the Joel Spolsky on Unicode article.

Note: in Python 3, the same differences apply, but 'some text' is a Unicode string literal, and b'some text' is the bytestring syntax.
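
The Python 3 note above can be illustrated with a minimal sketch. It assumes a Python 3 interpreter, and the variable names are made up for illustration; it only shows that the two literal forms produce different types and that moving from text to bytes is an explicit encoding step.

```python
# Python 3 sketch; variable names are illustrative only.
text = 'some text'    # Unicode string literal (type str)
data = b'some text'   # byte string literal (type bytes)

alpha = '\u0391'                 # the \u escape is processed: one code point
encoded = alpha.encode('utf-8')  # explicit conversion from text to bytes

print(type(text).__name__, type(data).__name__)  # str bytes
print(len(alpha), len(encoded))                  # 1 2
print(encoded)                                   # b'\xce\x91'
```

UTF-8 is used here only as an example encoding; other encodings would give different bytes for the same code point.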

Answer 2

As opposed to C, in Python a string can be enclosed in single quotes (') as well as double quotes ("), leaving aside the triple-double quotes """.

Thus, '\u0391' is only a string containing the characters \, u, 0, 3, 9 and 1. When pretty printing this string (as the REPL does when it echoes a value), the \ is escaped via another \.

In contrast, a u prefix makes the string a Unicode string, in which \u escapes are evaluated. Thus, u'\u0391' is interpreted as "the Unicode string containing code point 0391", which is different from the above.
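
The difference between the six-character string and the single code point can be checked with a short sketch. It is written for Python 3, where a bytes literal plays the role of the Python 2 plain string, and the backslash is written explicitly as \\ so that the example does not rely on an unrecognized escape. The variable names are illustrative only.

```python
import unicodedata

# Six separate characters: backslash, u, 0, 3, 9, 1.
# This is what the Python 2 literal '\u0391' actually holds.
raw = b'\\u0391'

# One code point, as produced by u'\u0391' in Python 2 or '\u0391' in Python 3.
text = '\u0391'

print(len(raw))                # 6
print(len(text))               # 1
print(repr(raw))               # b'\\u0391' (the backslash is shown escaped)
print(unicodedata.name(text))  # GREEK CAPITAL LETTER ALPHA

# The 'unicode_escape' codec applies the escape processing after the fact.
decoded = raw.decode('unicode_escape')
print(decoded == text)         # True
```

The repr line mirrors the extra slash seen for a1 in the question: the echoed form doubles the backslash so that the output is itself a valid literal.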


Answer 1: 6 votes, accepted, 2013-01-28 09:56:12

Answer 2: 2 votes, 2013-01-28 10:00:07
