Decoding UTF-8 in Python

Question

I have an expression like this, which produces the list of bytes in the UTF-8 representation of a character.

list(chr(number).encode("utf-8"))

But how do I go the other way?

Say I have 2 bytes [292, 200] as a list. How do I decode them into a symbol?

Answer 1

You can call bytes on a list of integers in the range 0..255.

So the reverse of your example looks like this:

>>> bytes([195, 136]).decode('utf8')
'È'
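
The 0..255 requirement matters: a value such as 292 from the question is not a byte, and bytes() rejects it. A list of valid bytes can also fail to decode if it is not well-formed UTF-8. Here is a minimal sketch of both failures, assuming a hypothetical helper named try_decode (not part of any library):

```python
def try_decode(byte_values):
    """Hypothetical helper: decode a list of ints as UTF-8, or name the failure."""
    try:
        return bytes(byte_values).decode("utf-8")
    except ValueError as exc:
        # UnicodeDecodeError is a subclass of ValueError, so both cases land here.
        return f"error: {type(exc).__name__}"


print(try_decode([195, 136]))  # È
print(try_decode([292, 200]))  # error: ValueError (292 is outside 0..255)
print(try_decode([200]))       # error: UnicodeDecodeError (lead byte with no continuation byte)
```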

If you want the code point instead, wrap the decoded result in ord():

>>> ord(bytes([195, 136]).decode('utf8'))
200

Note: the last step only works if the byte sequence corresponds to a single Unicode character (code point).
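
When the byte list holds more than one character, ord() on the whole string raises TypeError. One option is to decode once and take ord() of each character. This is a minimal sketch, assuming hypothetical helper names decode_utf8 and code_points:

```python
def decode_utf8(byte_values):
    """Decode a list of integers (each 0..255) as UTF-8 text."""
    return bytes(byte_values).decode("utf-8")


def code_points(byte_values):
    """Return the code point of every character in the decoded text."""
    return [ord(ch) for ch in decode_utf8(byte_values)]


data = [195, 136, 226, 130, 172]  # UTF-8 bytes for "È" followed by "€"
print(decode_utf8(data))   # È€
print(code_points(data))   # [200, 8364]

# Round trip back to the question's expression for a single character.
print(list(chr(200).encode("utf-8")))  # [195, 136]
```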

Answer 2

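Keep in mind that a single UTF-8 byte only covers code points 0 to 127. Python's chr() accepts larger values, but they encode to two or more bytes, so reading those bytes back with int.from_bytes() no longer returns the original number. Compare 127 and 128: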

number = 127
print(f"number: {number}")
li = list(chr(number).encode("utf-8"))
print(f"List of byte: {li}")
dec = int.from_bytes(li, byteorder='big')
print(f"Type dec: {type(dec)}")
print(f"Value dec: {dec}")

number = 128
print(f"number: {number}")
li = list(chr(number).encode("utf-8"))
print(f"List of byte: {li}")
dec = int.from_bytes(li, byteorder='big')
print(f"Type dec: {type(dec)}")
print(f"Value dec: {dec}")

See the Python documentation for converting values.
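
To make the difference concrete, the sketch below puts int.from_bytes() next to decode() plus ord() for the same encoded bytes. The function name roundtrip is hypothetical and only serves the illustration:

```python
def roundtrip(number):
    """Encode a code point to UTF-8, then read the bytes back two different ways."""
    encoded = list(chr(number).encode("utf-8"))
    # Treats the raw bytes as one big-endian integer; this is not UTF-8 decoding.
    as_int = int.from_bytes(bytes(encoded), byteorder="big")
    # Decodes the bytes as UTF-8 and recovers the original code point.
    decoded = ord(bytes(encoded).decode("utf-8"))
    return encoded, as_int, decoded


print(roundtrip(127))  # ([127], 127, 127)
print(roundtrip(128))  # ([194, 128], 49792, 128)
print(roundtrip(200))  # ([195, 136], 50056, 200)
```

The two readings agree only while the character fits in one byte. From 128 upward, decode() followed by ord() is the one that returns the original number.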


2 answers

Answer 1: 2, 2020-05-09 12:02:03

Answer 2: 1, accepted, 2020-05-09 12:41:52
