简体   繁体   中英

Storing 'struct' data to binary file

I need to store a binary file with a 12 byte header composed of 4 fields. They are namely: sSamples (4-bytes integer), sSampPeriod (4-bytes integer), sSampSize (2-bytes integer), and finally sParmKind (2-bytes integer). I'm using 'struct' to my variables to the desired fields. Now that I have them defined separately, how could I merge them all to store the '12 bytes header'?

sSamples        = struct.pack('i', nSamples) # 4-bytes integer
sSampPeriod     = struct.pack('i', nSampPeriod) # 4-bytes integer
sSampSize       = struct.pack('H', nSampSize) # 2-bytes integer / unsigned short
sParmKind       = struct.pack('H', 9) # 2-bytes integer / unsigned short

In addition, I've a npVect float array of dimensionality D ( numpy.ndarray - float32). How could I store this vector in the same binary file, but after the header?

As Cody Brocious wrote, you can pack your entire header at once:

header = struct.pack('<iiHH', nSamples, nSampPeriod, nSampSize, nParmKind)

He also mentioned endianness, which is important if you want to pack your data so as to reliably unpack it on machines with different architectures. The < at the beginning of my format string specifies "pack this data using a little-endian convention".

As for the array, you'll have to pack its length in order to determine how many values to unpack when you read it again. Doing it all in one call:

flattened = npVect.ravel()  # get a 1-D array of numbers
arrSize = len(flattened)
# pack header, count of numbers, and numbers, all in one call
packed = struct.pack('<iiHHi%df' % arrSize,
    nSamples, nSampPeriod, nSampSize, nParmKind, arrSize, *flattened)

Depending on how big your array is likely to be, you could end up with a huge string representing the entire contents of your binary file, and you might want to look into alternatives to struct which don't require you to have the entire file in memory.

Unpacking:

fmt = '<iiHHi'
nSamples, nSampPeriod, nSampSize, nParmKind, arrSize = struct.unpack(fmt, packed)
# Use unpack_from to start reading after the packed header and count
flattened = struct.unpack_from('<%df' % arrSize, packed, struct.calcsize(fmt))
npVect = np.ndarray(flattened, dtype='float32').reshape(# your dimensions go here
    )

EDIT : Oops, the array format isn't quite as simple as that :) The general idea holds, though: flatten your array into a list of numbers using any method you like, pack the number of values, then pack each value. On the other side, read the array as a flat list, then impose whatever structure you need on it.

EDIT : Changed format strings to use repeat specifiers, rather than string multiplication. Thanks to John Machin for pointing it out.

EDIT : Added numpy code to flatten the array before packing and reconstruct it after unpacking.

struct.pack returns a string, so you can combine the fields simply by string concatenation:

header = sSamples + sSampPeriod + sSampSize + sParmKind
assert len( header ) == 12

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM