
unicode() vs. str.decode() for a utf8 encoded byte string (python 2.x)

Original post: 2009-01-13 19:06:16 1 2 python/ unicode/ utf-8

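Is there any reason to prefer unicode(somestring, 'utf8') as opposed to somestring.decode('utf8')?

My only thought is that .decode() is a bound method, so Python may be able to resolve it more efficiently, but correct me if I'm wrong.

2 Answers

It's easy to benchmark: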

>>> from timeit import Timer
>>> ts = Timer("s.decode('utf-8')", "s = 'ééé'")
>>> ts.timeit()
8.9185450077056885
>>> tu = Timer("unicode(s, 'utf-8')", "s = 'ééé'") 
>>> tu.timeit()
2.7656929492950439
>>>

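Obviously, unicode() is faster: roughly 2.8 s versus 8.9 s for timeit's default of one million calls.

FWIW, I don't know where you got the impression that methods would be faster - it's quite the contrary.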
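
For anyone who wants to repeat the measurement, here is a minimal sketch that runs on both Python 2 and Python 3. It assumes that `str(s, 'utf-8')` is an acceptable Python 3 stand-in for `unicode(s, 'utf-8')`. The names `constructor`, `setup` and `NUMBER` are illustrative and not part of the original answer. Timings depend on the interpreter version and the machine, so no expected numbers are shown.

```python
import sys
from timeit import Timer

# Assumption: on Python 3 the built-in str(bytes, encoding) stands in
# for Python 2's unicode(bytes, encoding).
constructor = "unicode" if sys.version_info[0] == 2 else "str"

# UTF-8 encoding of three U+00E9 characters, written with escapes so the
# result does not depend on the source file or terminal encoding.
setup = "s = b'\\xc3\\xa9\\xc3\\xa9\\xc3\\xa9'"

NUMBER = 100000  # fewer iterations than timeit's default of 1,000,000

# Time the method call and the constructor call on the same byte string.
t_method = Timer("s.decode('utf-8')", setup).timeit(number=NUMBER)
t_constructor = Timer("%s(s, 'utf-8')" % constructor, setup).timeit(number=NUMBER)

print("s.decode('utf-8'): %.4f s" % t_method)
print("%s(s, 'utf-8'): %.4f s" % (constructor, t_constructor))
```

Both statements produce the same text object, so any difference is call overhead only. The ratio may well differ from the Python 2 figures quoted above.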

I'd prefer 'something'.decode(...), since the unicode type no longer exists in Python 3.0, while text = b'binarydata'.decode(encoding) is still valid.
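
The portability argument can be sketched as follows. The helper name `to_text` is hypothetical. It only illustrates that `.decode()` is spelled the same way on Python 2.6+ and Python 3, whereas a bare call to `unicode()` raises NameError on Python 3.

```python
def to_text(data, encoding="utf-8"):
    """Decode a byte string to text; return anything else unchanged."""
    # bytes is an alias of str on Python 2.6+, and a distinct type on Python 3.
    if isinstance(data, bytes):
        return data.decode(encoding)
    return data


raw = b"\xc3\xa9t\xc3\xa9"  # UTF-8 bytes: U+00E9, 't', U+00E9
text = to_text(raw)

print(repr(text))
print("%d bytes -> %d characters" % (len(raw), len(text)))
```

Passing already-decoded text through the helper returns it as-is, which avoids the implicit ASCII round trip that calling `.decode()` on a Python 2 unicode object would trigger.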
