
Convert a string containing hexadecimal escape characters to bytes in Python 3

I have a string that contains printable and unprintable characters, for instance:

'\xe8\x00\x00\x00\x00\x60\xfc\xe8\x89\x00\x00\x00\x60\x89'

What's the most "pythonesque" way to convert this to a bytes object in Python 3, i.e.:

b'\xe8\x00\x00\x00\x00`\xfc\xe8\x89\x00\x00\x00`\x89'

If all your codepoints are within the range U+0000 to U+00FF, you can encode to Latin-1:

inputstring.encode('latin1')

as the first 256 codepoints of Unicode (U+0000 to U+00FF) map one-to-one to bytes in the Latin-1 standard.

This is by far the fastest method, but it raises a UnicodeEncodeError if the input string contains any character outside that range.
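As a minimal sketch of how that failure could be handled, the helper below wraps the encode call and reports which character did not fit. The name to_bytes_latin1 and the choice to re-raise as ValueError are assumptions made for illustration, not part of the original answer.

```python
def to_bytes_latin1(text):
    """Encode text to bytes, one byte per codepoint (U+0000 to U+00FF only).

    Hypothetical helper: the name and the ValueError are illustrative choices.
    """
    try:
        return text.encode('latin1')
    except UnicodeEncodeError as exc:
        # exc.start is the index of the first character that cannot be encoded.
        bad = text[exc.start]
        raise ValueError(
            f"character {bad!r} at index {exc.start} is outside U+0000..U+00FF"
        ) from exc


# Every codepoint here is below U+0100, so this succeeds.
ok = to_bytes_latin1('\xe8\x00\x60')

# U+20AC (the euro sign) is outside the range, so this is rejected.
try:
    to_bytes_latin1('\u20ac')
    rejected = False
except ValueError:
    rejected = True
```

Whether to raise, skip, or replace such characters depends on where the string came from; the built-in errors argument of str.encode (for example errors='replace') is another option if losing data is acceptable.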

In short, if you have a Unicode string whose codepoints are really byte values that should never have been decoded as text, encode it to Latin-1 to get the original bytes back.

Demo:

>>> '\xe8\x00\x00\x00\x00\x60\xfc\xe8\x89\x00\x00\x00\x60\x89'.encode('latin1')
b'\xe8\x00\x00\x00\x00`\xfc\xe8\x89\x00\x00\x00`\x89'
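The one-to-one mapping can also be checked with a round trip over every possible byte value. This is only a sanity check under the assumption that the string was produced by decoding bytes as Latin-1; the variable names are illustrative.

```python
# All 256 possible byte values, 0x00 through 0xFF.
original = bytes(range(256))

# Decoding as Latin-1 maps each byte to the codepoint with the same number.
text = original.decode('latin1')

# Encoding back to Latin-1 reverses that mapping exactly.
restored = text.encode('latin1')

assert len(text) == 256
assert [ord(ch) for ch in text] == list(range(256))
assert restored == original
```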
