
How to read a binary file with multiple data types with a given structure

I've never dealt with a binary file containing multiple data types in Python, and I was hoping I could get some direction. The binary file contains the following data types:

String
Byte
UInt8 - Size in bytes: 1 - 8-bit unsigned integer.
UInt16 - Size in bytes: 2 - Little-endian encoded 16-bit unsigned integer.
UInt32 - Size in bytes: 4 - Little-endian encoded 32-bit unsigned integer.
UInt64 - Size in bytes: 8 - Little-endian encoded 64-bit unsigned integer.

What I've been unsuccessful at is decoding my data properly. The data uses a common message format that serves as a wrapper to deliver one or more higher-level messages. The fields contained in this wrapper are listed below.

Within this message I can have:
Length - Offset 0 - Size 2 - Type UInt16
Message Count - Offset 2 - Size 1 - Type UInt8
ID - Offset 3 - Size 1 - Type Byte
Sequence - Offset 4 - Size 4 - Type UInt32
Payload - Offset 8

The length specifies the length of the common message, and the message count gives the number of higher-level messages that begin in the Payload.

The higher level message begins in Payload with the following characteristics

Message Length - Offset 0 - Size 1 - Type UInt8
Message Type - Offset 1 - Size 1 - Type Byte

Once I'm able to figure out the Message Type of each higher-level message, the rest is trivial. I've been trying to create a BinaryReader class to do this for me, but I haven't been able to use struct.unpack successfully.

EDIT: This is an example of the common message
('7x\\xecM\\x00\\x00\\x00\\x00\\x15.\\x90\\xf1\\xc64CIDM')
and the higher level message inside it
('C\\x01dC\\x02H\\x00\\x15.\\xe8\\xf3\\xc64CIEN')

Construct is a great library for parsing binary data.


You might use it something like this:

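# Construct 2.5-style API; newer Construct releases use a different syntax.
from construct import *

message = Struct("wrapper",
    ULInt16("length"),      # the format is little-endian, hence UL* rather than UB*
    ULInt8("count"),
    Byte("id"),
    ULInt32("sequence"),
    # The number of higher-level messages comes from "count", not "length"
    Array(lambda ctx: ctx.count,
        Struct("message",
            ULInt8("length"),
            ULInt8("type"),
            # Assumes "length" counts only the content bytes; use
            # ctx.length - 2 if it also covers the two header bytes
            Bytes("content", lambda ctx: ctx.length),
        ),
    ),
)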

I think you could use the bitstring module for Python: http://code.google.com/p/python-bitstring/
It provides several nice features, including format strings for binary data.

Here you can find more about reading data and format strings.
http://pythonhosted.org/bitstring/reading.html#reading-using-format-strings
http://pythonhosted.org/bitstring/constbitstream.html#bitstring.ConstBitStream.read
http://pythonhosted.org/bitstring/constbitstream.html#bitstring.ConstBitStream.readlist

This code may give you an idea of a solution using bitstring.

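from bitstring import BitStream
bs = BitStream(your_binary_data)

# Wrapper header: 8 bytes, multi-byte fields are little-endian
length, message_count, wrapper_id, sequence = bs.readlist('uintle:16, uintle:8, bytes:1, uintle:32')
# Slice from the current read position onward to get what follows the header
payload = bs[bs.pos:]
message_length, message_type = payload.readlist('uintle:8, bytes:1')
rest_of_data = payload[payload.pos:]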
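
Since the question mentions struct.unpack, the same layout can probably also be read with the standard library alone. This is only a minimal sketch, not a tested reader for the real file: parse_wrapper and the sample bytes are made up for illustration, and it assumes that Message Length counts the whole higher-level message including its own two header bytes, which the question does not state.

```python
import struct

# Wrapper header: UInt16 length, UInt8 message count, 1-byte ID, UInt32 sequence.
# "<" selects little-endian with no padding, so the header is exactly 8 bytes.
HEADER = struct.Struct("<HBcI")
# Higher-level message header: UInt8 message length, 1-byte message type.
MESSAGE_HEADER = struct.Struct("<Bc")


def parse_wrapper(data, offset=0):
    """Parse one wrapper starting at offset; return (header, messages)."""
    length, count, ident, sequence = HEADER.unpack_from(data, offset)
    header = {"length": length, "count": count, "id": ident, "sequence": sequence}
    pos = offset + HEADER.size  # the payload starts at offset 8
    messages = []
    for _ in range(count):
        msg_length, msg_type = MESSAGE_HEADER.unpack_from(data, pos)
        # ASSUMPTION: msg_length covers the 2-byte message header too.
        body = data[pos + MESSAGE_HEADER.size:pos + msg_length]
        messages.append((msg_type, body))
        pos += msg_length
    return header, messages


# Hypothetical sample data: two messages, the second with an empty body.
payload = MESSAGE_HEADER.pack(5, b"C") + b"abc" + MESSAGE_HEADER.pack(2, b"D")
sample = HEADER.pack(HEADER.size + len(payload), 2, b"7", 1) + payload

header, messages = parse_wrapper(sample)
print(header)
print(messages)
```

If Message Length turns out to count only the bytes after the two-byte message header, the slice and the position update need to be shifted by MESSAGE_HEADER.size. To read a whole file, parse_wrapper could be called repeatedly, advancing offset by the wrapper's length field each time, assuming that field covers the entire wrapper.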
