I have the following problem:
I want to read a file into a raw binary string.
The file looks like this (escape characters stored as text, not binary data):
\xfc\xe8\x82\x00\x00\x00\x60\x89\xe5\x31\xc0\x64\x8b\x50\x30\x8b\x52
Code used:
data = open("filename", "rb").read()
Result obtained:
b"\\xfc\\xe8\\x82\\x00\\x00\\x00\\x60\\x89\\xe5\\x31\\xc0\\x64\\x8b\\x50\\x30\\x8b\\x52"
Every backslash is doubled (\\).
How can I read it as a binary string made of bytes such as \xaa (without the escape characters)?
This output is OK.
Python displays each backslash doubled because a literal backslash has to be escaped when a bytes object is shown, not because the data is non-printable. The data itself is stored correctly, as bytes, with a single backslash each.
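A quick way to check that the doubling is only a display matter is to compare the repr with the length and the decoded text. This is a small sketch; the sample bytes are an assumed stand-in for what open("filename", "rb").read() returns for a file containing the text \xfc\xe8.

```python
# Assumed sample: the bytes read from a file whose text is \xfc\xe8
data = b"\\xfc\\xe8"

print(repr(data))            # the repr escapes each backslash, so it looks doubled
print(len(data))             # 8: every \xNN is four separate ASCII bytes
print(data.decode("ascii"))  # the text itself has single backslashes
```

So the read is faithful, but the result is still text describing bytes rather than the bytes themselves.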
Ok. Your problem here is that you're asking the wrong question. Your data file isn't a raw binary string, it's an encoded one, encoded with escape characters. You're reading it as a raw binary, though, when you need instead to decode the escapes. Try
data = open("filename", "r", encoding='unicode_escape').read().encode('raw_unicode_escape')
instead.
Edit: ok, this now works. You need to encode into raw_unicode_escape, not utf-8 (the default).
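For illustration, here is a minimal sketch of the same decode-the-escapes idea, written as a helper that reads the file as bytes first. The function name read_escaped_file and the temporary sample file are hypothetical. It encodes with latin-1 instead of raw_unicode_escape; the two agree for code points 0-255, and latin-1 raises an error if the file contains anything outside that range.

```python
import os
import tempfile


def read_escaped_file(path):
    """Return the bytes described by a text file of hex escapes."""
    with open(path, "rb") as f:
        raw = f.read().strip()           # drop a trailing newline, if any
    text = raw.decode("unicode_escape")  # the 4 characters \xfc become 1 character
    return text.encode("latin-1")        # code points 0-255 map to the same byte values


# Hypothetical sample file holding the text \xfc\xe8\x82 plus a newline.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"\\xfc\\xe8\\x82\n")

try:
    result = read_escaped_file(tmp.name)
finally:
    os.remove(tmp.name)

print(result)       # b'\xfc\xe8\x82'
print(len(result))  # 3
```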
To convert 4 ASCII characters (\, x, f, c) from the file into a single byte (252 == 0xfc), you could read the ASCII characters as bytes (data = open("filename", "rb").read()), remove the \x prefixes, and convert the resulting hexadecimal bytestring into bytes containing the corresponding raw binary data:
>>> import binascii
>>> data = b'\\xfc\\xe8\\x82'
>>> binascii.unhexlify(data.replace(b'\\x', b''))
b'\xfc\xe8\x82'
It is best to avoid storing data as b'\\xfc' (4 bytes) instead of b'\xfc' (1 byte) in the first place.
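The replace-and-unhexlify approach can be wrapped in a small helper. This is a sketch only: the name unescape_hex is hypothetical, and it assumes the input consists purely of \xNN escapes (plus optional surrounding whitespace). Anything else makes binascii.unhexlify raise binascii.Error.

```python
import binascii


def unescape_hex(raw):
    """Turn bytes holding \\xNN escapes as text into the bytes they describe."""
    hex_digits = raw.strip().replace(b"\\x", b"")  # b"\\xfc\\xe8" -> b"fce8"
    return binascii.unhexlify(hex_digits)          # b"fce8" -> b"\xfc\xe8"


escaped = b"\\xfc\\xe8\\x82\\x00\n"  # assumed file contents, trailing newline included
decoded = unescape_hex(escaped)

print(decoded)       # b'\xfc\xe8\x82\x00'
print(len(decoded))  # 4
```

The same hex digits can also be passed to bytes.fromhex once they are decoded to str, for example bytes.fromhex("fce88200").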