
Python Version for Ruby's array.pack() and unpack()?

In Ruby, I could easily pack an array representing some sequence into a binary string:

# for int
# The "S!*" directive means unsigned short (16 bits here) in native byte order, repeated

# Each integer takes two bytes: "\x01\x00" and "\x02\x00".
# The native byte order here is little-endian, so read each pair
# backwards: "\x01\x00" becomes 0x0001 and "\x02\x00" becomes 0x0002.

"\x01\x00\x02\x00".unpack("S!*")
# [1, 2]


# for hex
# "H*" means the string is read as a stream of hex digits (high nibble first)

["037fea0651b358c361de"].pack("H*")
# "\x03\x7F\xEA\x06Q\xB3X\xC3a\xDE"

API docs for pack and unpack.

I couldn't find a uniform, equivalent way of transforming a sequence to bytes (or vice versa) in Python.

While struct provides methods for packing into bytes objects, its format strings have no option for a hex stream.

EDIT: What I really want is something as versatile as Ruby's arr.pack and str.unpack, which support multiple formats and endianness control.

struct only handles fixed-width encodings that correspond to a memory dump of something like a C struct. For hex, you want bytes.fromhex or binascii.unhexlify, depending on the source type (which is never a list).
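As a minimal sketch of the hex case, using the hex string from the question: bytes.fromhex goes from hex digits to bytes, and bytes.hex goes back. The variable names are only illustrative.

```python
# Hex string -> bytes, roughly what Ruby's ["..."].pack("H*") does.
hex_digits = "037fea0651b358c361de"
packed = bytes.fromhex(hex_digits)

# bytes -> hex string, roughly what Ruby's str.unpack("H*") does.
round_tripped = packed.hex()

print(packed)
print(round_tripped)
```

Unlike Ruby's pack, the input here is a single str rather than a one-element array.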

After any such conversion, you can use struct.unpack on a byte string containing any number of “records” corresponding to the format string; each is decoded into an element of the returned tuple. The format string supports the usual integer sizes and endianness choices; it is of course possible to construct a format dynamically to do things like read a matrix whose dimensions are chosen at runtime:

import struct
mat = list(struct.iter_unpack("%dd" % cols, buf))  # one tuple per row; row count follows from len(buf)
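For the 16-bit integer example from the question, a rough equivalent might look like the sketch below. It assumes the data is a whole number of unsigned 16-bit values; the "<" and ">" prefixes select little-endian and big-endian explicitly, and the count is derived from the buffer length.

```python
import struct

data = b"\x01\x00\x02\x00"

# Derive the element count from the buffer length (2 bytes per 16-bit int).
count = len(data) // struct.calcsize("<H")

little = struct.unpack("<%dH" % count, data)  # little-endian unsigned 16-bit
big = struct.unpack(">%dH" % count, data)     # big-endian unsigned 16-bit

# Packing is the mirror image: pass the values as separate arguments.
repacked = struct.pack("<%dH" % count, *little)

print(little, big, repacked)
```

Spelling out the byte order in the format string keeps the result the same on every machine, whereas a format without a prefix uses the native order and alignment.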

It's also possible to construct a lower-memory array (from the array module) if the element type is primitive; then you can follow up with byteswap, as Alec A mentioned. NumPy offers similar facilities.
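A possible sketch of the array approach, assuming the input holds little-endian 16-bit values and that the "H" type code is 2 bytes wide on the platform (the usual case, but not guaranteed by the language):

```python
import array
import sys

data = b"\x01\x00\x02\x00"  # two little-endian 16-bit integers

values = array.array("H")   # "H": unsigned short, native byte order
values.frombytes(data)

# array always interprets bytes in native order, so swap only when the
# host order differs from the order of the data (little-endian here).
if sys.byteorder == "big":
    values.byteswap()

print(values.tolist())
```

The array stores the numbers compactly rather than as separate Python int objects, which is where the memory saving comes from.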

For a string in the UTF-8 range, it would be:

from binascii import unhexlify

strg = "464F4F"
unhexlify(strg).decode()  # FOO  (str)

If your content is just binary:

strg = "037fea0651b358c361de"
unhexlify(strg) # b'\x03\x7f\xea\x06Q\xb3X\xc3a\xde'  (bytes)

Also, bytes.fromhex (as in Davis Herring's answer) may be worth checking out.

Try memoryview.cast, which lets you reinterpret an array or bytes object as items of a different native format without copying. It does not swap byte order by itself.
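A small sketch of memoryview.cast on the bytes from the question. The cast uses the machine's native byte order, so the result shown in the comments depends on the host.

```python
import sys

data = b"\x01\x00\x02\x00"

# Reinterpret the four bytes as native unsigned shorts without copying.
view = memoryview(data).cast("H")
values = view.tolist()

# [1, 2] on a little-endian host, [256, 512] on a big-endian one.
print(sys.byteorder, values)
```

The buffer length has to be a multiple of the target item size, otherwise cast raises TypeError.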

Storing values in an array.array makes things easier, as you can then use its byteswap method.
