
unicode() vs. str.decode() for a utf8 encoded byte string (python 2.x)

Originally posted 2009-01-13 19:06:16. Tags: python, unicode, utf-8

Is there any reason to prefer unicode(somestring, 'utf8') over somestring.decode('utf8')?

My only thought is that .decode() is a bound method, so Python may be able to resolve it more efficiently, but correct me if I'm wrong.

2 Answers

It's easy to benchmark:

>>> from timeit import Timer
>>> ts = Timer("s.decode('utf-8')", "s = 'ééé'")
>>> ts.timeit()
8.9185450077056885
>>> tu = Timer("unicode(s, 'utf-8')", "s = 'ééé'") 
>>> tu.timeit()
2.7656929492950439
>>>

Obviously, unicode() is faster.

FWIW, I don't know where you got the idea that methods would be faster - it's quite the contrary.
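For anyone who wants to repeat the measurement, here is a minimal sketch that runs on both Python 2 and Python 3. It assumes explicit UTF-8 bytes for 'ééé' so the result does not depend on the terminal encoding, and on Python 3 it substitutes str(s, 'utf-8') for unicode(s, 'utf-8'), since the unicode built-in is gone there. The helper name bench and the iteration count are illustrative choices, not part of the original benchmark, and the timings will differ from the 2009 figures above.

```python
import timeit

# Explicit UTF-8 bytes for 'ééé'; the raw string keeps the escapes intact
# so they are interpreted inside the timed setup code.
setup = r"s = b'\xc3\xa9\xc3\xa9\xc3\xa9'"

try:
    unicode  # Python 2: the built-in exists
    constructor = "unicode(s, 'utf-8')"
except NameError:
    # Python 3 assumption: str(bytes, encoding) is the closest equivalent
    constructor = "str(s, 'utf-8')"


def bench(stmt, number=100000):
    """Return total seconds for `number` runs of `stmt` (hypothetical helper)."""
    return timeit.timeit(stmt, setup=setup, number=number)


method_time = bench("s.decode('utf-8')")
constructor_time = bench(constructor)

print("method:      %.4f s" % method_time)
print("constructor: %.4f s" % constructor_time)
```

Which of the two comes out ahead depends on the interpreter version, so treat the numbers as a local measurement rather than a general rule.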

I'd prefer 'something'.decode(...) since the unicode type no longer exists in Python 3.0, while text = b'binarydata'.decode(encoding) is still valid.
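The portability point can be sketched as follows. The variable names (raw, text, text_type) are made up for illustration; the fallback to str on Python 3 is an assumption about how one might keep the constructor form working, not something the answer proposes.

```python
# UTF-8 encoded bytes for three 'é' characters.
raw = b'\xc3\xa9\xc3\xa9\xc3\xa9'

# The method form is spelled the same way on Python 2 and Python 3.
text = raw.decode('utf-8')

# The constructor form needs a version check, because the name
# `unicode` raises NameError on Python 3.
try:
    text_type = unicode  # Python 2
except NameError:
    text_type = str      # Python 3: str(bytes, encoding) also decodes

same_text = text_type(raw, 'utf-8')

print(text == same_text)
print(len(text))
```

Both forms yield the same three-character text string; only the .decode() spelling avoids the version check.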

