
Decode a Python bytes string without restricting the output to ASCII symbols

I have a bytes object:

b'{"node": "\\u041e\\u0431\\u043d\\u043e\\u0432\\u043b\\u0435\\u043d\\u0438\\u0435"}'

and I want to print it with the actual Unicode characters rather than ASCII-only escape sequences.

There is a hacky way to do it:

import json

decoded = string.decode()  # string is the bytes object shown above
parsed_to_dict = json.loads(decoded)
dumped = json.dumps(parsed_to_dict, ensure_ascii=False)
print(dumped)

>>> {"node": "Обновление"}

However, the text will not always be parseable as JSON, so I need a simpler way.

Is there a way to print my bytes object (or a decoded Unicode string) as a non-ASCII string without going through parsing and dumping JSON?

For example, how can I print this b'\\\О\\\б\\\н\\\о\\\в\\\л\\\е\\\н\\\и\\\е' as Обновление?

A bytes string like

b'\\u041e\\u0431\\u043d\\u043e\\u0432\\u043b\\u0435\\u043d\\u0438\\u0435'

has been encoded using Unicode escape sequences. To convert it back into a proper Unicode string you simply need to specify the 'unicode-escape' codec:

data = b'\\u041e\\u0431\\u043d\\u043e\\u0432\\u043b\\u0435\\u043d\\u0438\\u0435'
out = data.decode('unicode-escape')
print(out)

output

Обновление
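
The same codec can be applied to the JSON-like bytes from the question without any JSON parsing. The following is a minimal sketch, assuming a well-formed variant of the question's bytes; the names raw and text are illustrative only.

```python
# Bytes similar to the question's example: literal backslash-u escapes
# embedded in JSON-like text.
raw = b'{"node": "\\u041e\\u0431\\u043d\\u043e\\u0432\\u043b\\u0435\\u043d\\u0438\\u0435"}'

# No JSON parsing happens here: the codec only interprets escape sequences,
# so the surrounding braces and quotes pass through unchanged.
text = raw.decode('unicode-escape')
print(text)  # {"node": "Обновление"}
```

Because nothing is parsed, this should also work on text that is not valid JSON, provided the escape sequences themselves are valid.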

However, if data is already a Unicode string, then you first need to encode it to bytes. You can do that using the 'ascii' codec, presuming data only contains ASCII characters. If it contains characters outside ASCII but within the range \x80 to \xff, you may be able to use the 'latin1' codec.

data = '\\u041e\\u0431\\u043d\\u043e\\u0432\\u043b\\u0435\\u043d\\u0438\\u0435'
out = data.encode('ascii').decode('unicode-escape')

This should work as long as every escape sequence is valid (no stray single \ ).
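
The two cases above (bytes input and str input) can be folded into one function. This is only a sketch: unescape is a hypothetical helper name, and using 'latin1' for str input is an assumption that the text contains no characters above \xff. The last part shows what an invalid escape looks like in practice.

```python
def unescape(data):
    """Hypothetical helper: turn backslash-u escapes in bytes or str into real characters."""
    if isinstance(data, str):
        # latin1 covers U+0000..U+00FF; characters above that range
        # raise UnicodeEncodeError here.
        data = data.encode('latin1')
    return data.decode('unicode-escape')

print(unescape(b'\\u041e\\u0431'))   # Об
print(unescape('caf\xe9 \\u041e'))   # café О

try:
    unescape(b'\\u04')  # truncated escape: fewer than four hex digits
except UnicodeDecodeError as exc:
    print('invalid escape:', exc)
```

If the input may already contain characters outside Latin-1, this approach does not apply directly, since the encode step fails before any escapes are decoded.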

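import ast

bytes_object = b'{"node": "\\u041e\\u0431\\u043d\\u043e\\u0432\\u043b\\u0435\\u043d\\u0438\\u0435"}'

# Wrap the decoded text in single quotes so it is evaluated as a Python string literal,
# which makes Python itself interpret the \uXXXX escapes.
unicode_string = ast.literal_eval("'{}'".format(bytes_object.decode()))

output:

'{"node": "Обновление"}'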
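
One caveat with the ast.literal_eval approach above: the text is wrapped in single quotes, so it appears to break when the data itself contains a single quote. The sketch below illustrates this; unescape_via_literal_eval is a hypothetical wrapper name, and the sample inputs are made up for illustration.

```python
import ast

def unescape_via_literal_eval(raw):
    """Hypothetical wrapper around the ast.literal_eval approach."""
    return ast.literal_eval("'{}'".format(raw.decode()))

print(unescape_via_literal_eval(b'\\u041e\\u043a'))  # Ок

try:
    # The apostrophe closes the surrounding string literal early.
    unescape_via_literal_eval(b"it's \\u041e\\u043a")
except SyntaxError as exc:
    print('literal_eval failed:', exc)
```

For arbitrary input, decoding with the 'unicode-escape' codec avoids this quoting problem.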
