
Python: convert a mix of UTF-16 and UTF-8(?) to a regular string

I have bytes (from requests.get) like this:

<th class=\"app_result_head\">\u0414\u043e\u043b\u0436\u043d\u0438\u043a<\/th>

How do I convert this to a proper Python string like this?

<th class="app_result_head">Должник</th>

Here my_bytes is the bytes object in question. As it turns out, the answer is rather simple:

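out = my_bytes.decode('raw_unicode_escape')  # turn \uXXXX escapes into real characters
out = out.replace('\\"', '"')  # backslash-quote -> plain quote
out = out.replace('\\/', '/')  # backslash-slash -> plain slash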
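The approach above can be tried end to end on the sample from the question. This is a minimal sketch: the helper name unescape_fragment is hypothetical, and the bytes literal only approximates what requests.get might return. Note that the backslashes in the replace() arguments are doubled, so that Python really searches for a backslash followed by a quote or a slash.

```python
# Sample bytes modelled on the question (assumed shape of the response body).
my_bytes = b'<th class=\\"app_result_head\\">\\u0414\\u043e\\u043b\\u0436\\u043d\\u0438\\u043a<\\/th>'


def unescape_fragment(raw: bytes) -> str:
    """Decode \\uXXXX escapes, then strip the backslashes before quotes and slashes."""
    text = raw.decode('raw_unicode_escape')
    text = text.replace('\\"', '"')
    return text.replace('\\/', '/')


result = unescape_fragment(my_bytes)
print(result)
```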

From the docs for raw_unicode_escape:

Latin-1 encoding with \uXXXX and \UXXXXXXXX for other code points.
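The "Latin-1" part of that description matters when the input is not pure ASCII. The short sketch below (with made-up sample bytes) shows that a single non-ASCII byte is read as Latin-1, so any raw UTF-8 sequences mixed into the data would come out garbled rather than decoded.

```python
# A Latin-1 byte (0xE9) next to a \uXXXX escape: both decode as expected.
mixed = b'caf\xe9 \\u0414'.decode('raw_unicode_escape')
print(mixed)

# The same character encoded as UTF-8 is two bytes, and each byte is
# interpreted as a separate Latin-1 character.
mojibake = 'é'.encode('utf-8').decode('raw_unicode_escape')
print(mojibake)
```

If the response could contain raw UTF-8 bytes as well as escapes, this codec alone would not be enough.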

This is exactly what I needed.
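As a possible alternative: the \", \/ and \uXXXX sequences in the sample are the same escapes used inside JSON strings. Assuming the bytes really are the inside of a JSON string (the question does not confirm this), a sketch using the standard json module could undo all three escapes in one step. The sample literal below is again only modelled on the question.

```python
import json

# Assumed sample: the contents of a JSON string, without the surrounding quotes.
my_bytes = b'<th class=\\"app_result_head\\">\\u0414\\u043e\\u043b\\u0436\\u043d\\u0438\\u043a<\\/th>'

# Wrap the fragment in double quotes so it parses as a JSON string literal.
fragment = my_bytes.decode('utf-8')
decoded = json.loads('"' + fragment + '"')
print(decoded)
```

This would raise json.JSONDecodeError if the fragment contained an unescaped double quote, so it only fits data that is valid JSON string content.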
