Python 3 adds an extra byte when printing hex values

Question

I have run into a strange difference between Python 2 and Python 3. Printing the same list of characters yields an extra byte, C2, when printed with Python 3. I would have expected the same behaviour; Python 2 behaves as I expected. What am I missing here?

$ python3 -c "print('\x30\xA0\x04\x08')" | xxd
0000000: 30c2 a004 080a     
$ python2 -c "print('\x30\xA0\x04\x08')" | xxd
0000000: 30a0 0408 0a

Answer 1

Python 3 strings are Unicode, and on your platform Unicode is printed using the UTF-8 encoding. The UTF-8 encoding of the Unicode character U+00A0 is 0xC2 0xA0, which is what you see.

Python 2 strings are byte strings, so their bytes are written out unchanged.
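
The difference can be reproduced without a pipe. The following is a minimal sketch, assuming Python 3.5 or later (for bytes.hex()); the variable names are illustrative only. It encodes the str from the question as UTF-8, as print does on a UTF-8 terminal, and compares it with a bytes literal holding the same escapes.

```python
# A str holds Unicode code points; '\xA0' here means U+00A0.
text = '\x30\xA0\x04\x08'

# Encoding to UTF-8 turns U+00A0 into the two bytes C2 A0.
encoded = text.encode('utf-8')
print(encoded.hex())  # 30c2a00408

# A bytes literal holds raw bytes; b'\xA0' is the single byte A0.
raw = b'\x30\xA0\x04\x08'
print(raw.hex())  # 30a00408
```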

Answer 2

In Python 3, all string literals are Unicode.

\xA0 in a string literal is the code point U+00A0, a no-break space, which takes two bytes in UTF-8:

U+00A0 no-break space (HTML &#160; · &nbsp;) can be encoded in UTF-8 as C2 A0
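
The two bytes follow from the UTF-8 two-byte layout (110xxxxx 10xxxxxx), which covers code points U+0080 to U+07FF. Here is a small sketch, with hypothetical variable names, that builds the bytes by hand and checks them against Python's own encoder:

```python
cp = 0xA0  # code point U+00A0, binary 000 1010 0000 (11 bits)

# Lead byte: marker bits 110 followed by the top 5 bits of the code point.
lead = 0xC0 | (cp >> 6)
# Continuation byte: marker bits 10 followed by the low 6 bits.
cont = 0x80 | (cp & 0x3F)

manual = bytes([lead, cont])
print(manual.hex())  # c2a0

# Cross-check with the built-in UTF-8 encoder.
print(manual == chr(cp).encode('utf-8'))  # True
```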

Try this:

$ python3 -c "import sys; sys.stdout.buffer.write(b'\x30\xA0\x04\x08')" | xxd
0000000: 30a0 0408                                0...
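
If the data already lives in a str rather than a bytes literal, one option (not taken from the answers above, offered only as a sketch) is to encode it with Latin-1, which maps code points 0 to 255 onto the identical byte values. write_raw is a hypothetical helper name, and io.BytesIO stands in for the output stream so the result can be inspected.

```python
import io


def write_raw(text, stream):
    """Write each code point of text as a single byte to a binary stream.

    Latin-1 maps U+0000..U+00FF to bytes 00..FF one to one; any code point
    above U+00FF raises UnicodeEncodeError.
    """
    stream.write(text.encode('latin-1'))


buf = io.BytesIO()
write_raw('\x30\xA0\x04\x08', buf)
print(buf.getvalue().hex())  # 30a00408
```

Passing sys.stdout.buffer instead of buf would send the same four bytes to standard output, with no trailing newline, as in the command above.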

Answer 1
6 · accepted · 2015-02-10 10:19:03

Answer 2
6 · 2015-02-10 10:19:43
