JSON returns special characters

Question

An API I am consuming returns JSON containing the characters "\u0083", "\u0087d" and "\u008d". I tried changing the encoding to UTF-8 and to ISO-8859-1, but neither worked. Can someone help? The API itself will not be changed.

I also changed the encoding in the request header, without success.

Examples:

"prop": "SÃ\O LUÃ\S",
"prop": "RUA LUIZ GUIMARÃ\ES",
"prop": "POÃ\O DA PANELA"

Answer 1

You have UTF-8 bytes being decoded as ISO-8859-1.

'SÃO LUÍS' encoded as UTF-8 results in these bytes (the notation is Python, but the principles apply in any language):

b'S\xc3\x83O LU\xc3\x8dS'

Decoding as ISO-8859-1 produces this string:

'SÃ\x83O LUÃ\x8dS'
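The two steps above can be reproduced with a short sketch. It assumes the server really does produce UTF-8 bytes and that something downstream reads them as ISO-8859-1; the variable names are illustrative only.

```python
original = 'SÃO LUÍS'

# Step 1: the text encoded as UTF-8 (assumed to be what the server holds).
utf8_bytes = original.encode('utf-8')

# Step 2: the faulty step, reading those UTF-8 bytes as ISO-8859-1.
garbled = utf8_bytes.decode('latin-1')

print(utf8_bytes)     # the raw bytes
print(repr(garbled))  # the corrupted string, control characters shown as escapes
```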

UTF-8 is a multi-byte encoding, whereas ISO-8859-1 is a single-byte encoding, so each UTF-8 byte is decoded as a separate character. In this case the first byte of both UTF-8 encoded 'Ã' and 'Í' is \xc3, which is the ISO-8859-1 encoding for 'Ã'. The second bytes, \x83 and \x8d, fall in the range 0x80-0x9F, which holds no printable characters in ISO-8859-1; the decoder passes them through as the control characters U+0083 and U+008D, which is why they appear as \u0083 and \u008d in the JSON.

Assuming this corrupted data is generated by the API, you will need to iterate over the deserialised JSON data and encode each string as ISO-8859-1, then decode the resulting bytes as UTF-8.

>>> bad = 'SÃ\u0083O LUÃ\u008dS'
>>> bad.encode('latin-1').decode('utf-8')
'SÃO LUÍS'
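To apply the same fix to a whole response, one option is a small recursive helper that walks the deserialised data. This is only a sketch: the function name `repair_mojibake` and the sample payload are made up for illustration, and it assumes every damaged string was corrupted in the same UTF-8-read-as-ISO-8859-1 way. Strings that cannot make the round trip are left untouched.

```python
import json


def repair_mojibake(value):
    """Recursively undo UTF-8-decoded-as-Latin-1 damage in parsed JSON data."""
    if isinstance(value, str):
        try:
            return value.encode('latin-1').decode('utf-8')
        except (UnicodeEncodeError, UnicodeDecodeError):
            # Not representable in Latin-1, or not valid UTF-8 afterwards:
            # assume this string was never corrupted and keep it as it is.
            return value
    if isinstance(value, list):
        return [repair_mojibake(item) for item in value]
    if isinstance(value, dict):
        return {repair_mojibake(k): repair_mojibake(v) for k, v in value.items()}
    return value  # numbers, booleans and None pass through unchanged


# Hypothetical payload shaped like the examples in the question.
raw = '{"prop": "S\\u00c3\\u0083O LU\\u00c3\\u008dS", "count": 2}'
data = repair_mojibake(json.loads(raw))
print(data)
```

Plain ASCII strings survive the round trip unchanged, and a string that is already correct (such as 'SÃO LUÍS') fails the UTF-8 decode and is returned as it was, so running the helper over mixed data should be safe in the common case.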

Answer 1
1 2019-06-28 17:45:31
