
How to convert the string representation of a binary string from a text file back into the UTF-8 encoded text it came from?

I have a word in Russian: "привет". Encoding it to UTF-8 with 'привет'.encode('utf-8') gives a Python bytes object, represented as:

b'\xd0\xbf\xd1\x80\xd0\xb8\xd0\xb2\xd0\xb5\xd1\x82'

I saved it to a file, and when I read that file back I get this string: "b'\\xd0\\xbf\\xd1\\x80\\xd0\\xb8\\xd0\\xb2\\xd0\\xb5\\xd1\\x82'"

How do I decode this string into the original word?

It is not the bytes object I'm trying to decode but a string, so

"b'\\xd0\\xbf\\xd1\\x80\\xd0\\xb8\\xd0\\xb2\\xd0\\xb5\\xd1\\x82'".decode('utf-8') 

raises AttributeError: 'str' object has no attribute 'decode'

I save it to the file simply by calling logger.info(x.encode('utf-8')), where logger is set up as

import logging 
logger = logging.getLogger('GENERATOR_DYNAMICS')

and I read the file like this:

with open('file.log') as f:
    logs = f.readlines()

Your problem is twofold:

  • you have the string representation of a bytes object (read from a file, but that is largely irrelevant)
  • you want to turn those bytes back into UTF-8 text

So the solution has two steps as well:

import ast

# convert string representation back into binary
string_rep = "b'\\xd0\\xbf\\xd1\\x80\\xd0\\xb8\\xd0\\xb2\\xd0\\xb5\\xd1\\x82'"
as_binary = ast.literal_eval(string_rep)

# decode the UTF-8 bytes into text
text = as_binary.decode("utf8")
 

to get 'привет' again.
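Lines read from a real log file usually carry more than the bare literal: a trailing newline and, depending on the logging format, a prefix. The sketch below applies the same two steps to such a line. The helper name decode_logged_bytes, the regular expression, and the "INFO:GENERATOR_DYNAMICS:" prefix in the sample line are assumptions for illustration, not something taken from the question's actual log. It also assumes the literal was written with single quotes, which is how Python prints bytes that contain no single quote of their own.

```python
import ast
import re

# Matches a single-quoted bytes literal such as b'\xd0\xbf' anywhere in a line,
# allowing backslash escapes inside it. Assumption: the literal uses single quotes.
BYTES_LITERAL = re.compile(r"b'(?:[^'\\]|\\.)*'")


def decode_logged_bytes(line):
    """Return the UTF-8 text for the first bytes literal in `line`, or None."""
    match = BYTES_LITERAL.search(line)
    if match is None:
        return None
    # Step 1: string representation -> bytes object
    as_binary = ast.literal_eval(match.group(0))
    # Step 2: UTF-8 bytes -> text
    return as_binary.decode("utf-8")


# Hypothetical log line; the prefix is only an assumed format.
sample_line = "INFO:GENERATOR_DYNAMICS:b'\\xd0\\xbf\\xd1\\x80\\xd0\\xb8\\xd0\\xb2\\xd0\\xb5\\xd1\\x82'\n"
print(decode_logged_bytes(sample_line))
```

With the question's reading code this could be used as [decode_logged_bytes(line) for line in logs]; lines without a bytes literal come back as None.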

The last part is a duplicate of Python3: Decode UTF-8 bytes converted as string
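If you control the code that writes the log, it may be simpler to avoid the round trip altogether by logging the text itself and letting the file handler do the UTF-8 encoding. This is only a sketch under assumptions: the logger name, the temporary log path, and the handler setup are made up here and are not the question's configuration.

```python
import logging
import os
import tempfile

# Hypothetical log location, used only for this sketch.
log_path = os.path.join(tempfile.mkdtemp(), "file.log")

logger = logging.getLogger("GENERATOR_DYNAMICS_SKETCH")
logger.setLevel(logging.INFO)

# The handler writes the file as UTF-8, so the str can be logged as-is.
handler = logging.FileHandler(log_path, encoding="utf-8")
logger.addHandler(handler)

logger.info("привет")  # log the text, not x.encode('utf-8')

handler.close()
logger.removeHandler(handler)

# Read the file back with the same encoding.
with open(log_path, encoding="utf-8") as f:
    logs = f.readlines()

print(logs[0].strip())
```

Here the file contains the word itself rather than the b'...' representation, so nothing needs to be parsed when reading it back.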
