
Python: encode and decode with UTF-16

>>> enbytes = b'\xdf\x81\x9e\xbf"Q\xa37\xd0\x7f\x18\x1d:J\xe2\xa1'
>>> enbytes.decode('utf-16').encode('utf-16')
b'\xff\xfe\xdf\x81\x9e\xbf"Q\xa37\xd0\x7f\x18\x1d:J\xe2\xa1'

Why doesn't enbytes equal the re-encoded bytes? How can I fix that?

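That's the byte-order mark (BOM). The generic 'utf-16' codec prepends it when encoding (here b'\xff\xfe', which marks little-endian) so that a decoder can detect the byte order. To prevent the BOM from being added, encode with a codec that names a specific endianness, here little-endian: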

>>> enbytes.decode('utf-16').encode('utf-16-le')
b'\xdf\x81\x9e\xbf"Q\xa37\xd0\x7f\x18\x1d:J\xe2\xa1'
>>> enbytes == enbytes.decode('utf-16').encode('utf-16-le')
True
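
As a minimal sketch of the same idea, the script below round-trips the bytes with an explicit byte order on both the decode and the encode side. It assumes the data really is little-endian UTF-16 without a BOM, which matches the output shown above. The variable names other than enbytes are illustrative only. Naming 'utf-16-le' on the decode side as well should avoid depending on the platform's native byte order, which the plain 'utf-16' codec falls back to when the input has no BOM.

```python
import codecs

enbytes = b'\xdf\x81\x9e\xbf"Q\xa37\xd0\x7f\x18\x1d:J\xe2\xa1'

# Assumption: the bytes are little-endian UTF-16 with no BOM.
# Using the same explicit codec in both directions keeps the BOM out.
text = enbytes.decode('utf-16-le')
roundtrip = text.encode('utf-16-le')
print(roundtrip == enbytes)

# The generic 'utf-16' codec prepends a two-byte BOM when encoding.
with_bom = text.encode('utf-16')
has_bom = with_bom[:2] in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)
print(has_bom, len(with_bom) - len(enbytes))

# Decoding with 'utf-16' consumes the BOM again, so the text is unchanged.
print(with_bom.decode('utf-16') == text)
```

If the BOM is wanted in a file or on the wire, keep 'utf-16' for encoding and compare decoded strings rather than raw bytes.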


 