简体   繁体   中英

How can I unpack binary hex formatted data in Python?

Using the PHP pack() function, I have converted a string into a binary hex representation:

$string = md5(time); // 32 character length
$packed = pack('H*', $string);

The H* formatting means "Hex string, high nibble first".

To unpack this in PHP, I would simply use the unpack() function with the H* format flag.

How would I unpack this data in Python?

There's an easy way to do this with the binascii module:

>>> import binascii
>>> print binascii.hexlify("ABCZ")
'4142435a'
>>> print binascii.unhexlify("4142435a")
'ABCZ'

Unless I'm misunderstanding something about the nibble ordering (high-nibble first is the default… anything different is insane), that should be perfectly sufficient!

Furthermore, Python's hashlib.md5 objects have a hexdigest() method to automatically convert the MD5 digest to an ASCII hex string, so that this method isn't even necessary for MD5 digests. Hope that helps.

There's no corresponding "hex nibble" code for struct.pack, so you'll either need to manually pack into bytes first, like:

hex_string = 'abcdef12'

hexdigits = [int(x, 16) for x in hex_string]
data = ''.join(struct.pack('B', (high <<4) + low) 
               for high, low in zip(hexdigits[::2], hexdigits[1::2]))

Or better, you can just use the hex codec. ie.

>>> data = hex_string.decode('hex')
>>> data
'\xab\xcd\xef\x12'

To unpack, you can encode the result back to hex similarly

>>> data.encode('hex')
'abcdef12'

However, note that for your example, there's probably no need to take the round-trip through a hex representation at all when encoding. Just use the md5 binary digest directly. ie.

>>> x = md5.md5('some string')
>>> x.digest()
'Z\xc7I\xfb\xee\xc96\x07\xfc(\xd6f\xbe\x85\xe7:'

This is equivalent to your pack()ed representation. To get the hex representation, use the same unpack method above:

>>> x.digest().decode('hex')
'acbd18db4cc2f85cedef654fccc4a4d8'
>>> x.hexdigest()
'acbd18db4cc2f85cedef654fccc4a4d8'

[Edit]: Updated to use better method (hex codec)

In Python you use the struct module for this.

>>> from struct import *
>>> pack('hhl', 1, 2, 3)
'\x00\x01\x00\x02\x00\x00\x00\x03'
>>> unpack('hhl', '\x00\x01\x00\x02\x00\x00\x00\x03')
(1, 2, 3)
>>> calcsize('hhl')
8

HTH

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM