Struct.unpack and Length of Byte Object

I have the following code (data is a byte object):

    v = sum(struct.unpack('!%sH' % int(len(data)/2), data))

The part that confuses me is the %sH in the format string and the % int(len(data)/2).

How exactly is this part of the code working? What is the length of a byte object? And what exactly is this taking the sum of?

Assuming you have a byte string data such as:

>>> data = b'\x01\x02\x03\x04'
>>> data
b'\x01\x02\x03\x04'

The length is the number of bytes (or characters) in the byte string:

>>> len(data)
4

So this is equivalent to your code:

>>> import struct
>>> struct.unpack('!2H', data)
(258, 772)

This tells the struct module to use the following format characters:

  • ! - use network (big endian) mode
  • 2H - unpack 2 x unsigned shorts (16 bits each)

And it returns two integers which correspond to the data we supplied:

>>> '%04x' % 258
'0102'
>>> '%04x' % 772
'0304'
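
To see where 258 and 772 come from, the following sketch rebuilds each big-endian unsigned short by hand from its two bytes and compares the result with struct.unpack. The helper name manual_unpack_be_shorts is made up for illustration, and the sketch assumes Python 3, where indexing a bytes object yields integers.

```python
import struct

def manual_unpack_be_shorts(data):
    """Hypothetical helper: decode big-endian 16-bit unsigned ints by hand."""
    values = []
    # Walk the data two bytes at a time; a trailing odd byte is ignored here.
    for i in range(0, len(data) - 1, 2):
        high, low = data[i], data[i + 1]   # Python 3: indexing bytes gives ints
        values.append((high << 8) | low)   # the first byte is the most significant
    return tuple(values)

data = b'\x01\x02\x03\x04'
print(manual_unpack_be_shorts(data))   # (258, 772)
print(struct.unpack('!2H', data))      # (258, 772)
print(struct.unpack('<2H', data))      # little endian instead: (513, 1027)
```

The last line shows why the ! matters: the same four bytes read as little endian give different numbers.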

All your code does is calculate the number of unsigned shorts on the fly:

>>> struct.unpack('!%sH' % int(len(data)/2), data)
(258, 772)

With %d the int conversion is unnecessary, and the code shouldn't really be using the %s placeholder anyway, as that is meant for string substitution:

>>> struct.unpack('!%dH' % (len(data)/2), data)
(258, 772)

So unpack returns a tuple of two integers, one for each unsigned short unpacked from the data byte string. sum then adds them together:

>>> sum(struct.unpack('!%dH' % (len(data)/2), data))
1030
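
Putting the steps together, here is a minimal sketch of the same computation as a function. The name sum_be_shorts is invented for this example, and it uses integer division (//) so that the count is an int on both Python 2 and Python 3.

```python
import struct

def sum_be_shorts(data):
    """Hypothetical helper: sum the big-endian unsigned shorts in data."""
    count = len(data) // 2          # number of 2-byte values; always an int
    fmt = '!%dH' % count            # e.g. '!2H' for 4 bytes
    return sum(struct.unpack(fmt, data))

print(sum_be_shorts(b'\x01\x02\x03\x04'))   # 258 + 772 = 1030
```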

How your code works:

  • You are interpreting the byte structure of data
    • struct.unpack uses a string to determine the byte format of the data you want to interpret
    • Given the format, struct.unpack returns a tuple of the interpreted values.
  • You then sum the tuple.

Byte Formatting

To interpret the data you pass in, you build a format string that tells Python how the data is laid out. Specifically, %sH is a template for "this many unsigned shorts"; the % operator then fills in the exact number of unsigned shorts you want.

In this case the number is:

    int(len(data) / 2)

because an unsigned short is 2 bytes wide (with the ! prefix, struct always uses this standard size).
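
The width can be checked with struct.calcsize, and the division also points to a limitation: if data has an odd number of bytes, the format describes one byte fewer than the buffer holds, and struct.unpack raises struct.error. A short sketch with made-up sample bytes follows; the slicing workaround at the end is only one possible approach and is not part of the original code.

```python
import struct

# With the '!' prefix, struct uses standard sizes: H is always 2 bytes.
print(struct.calcsize('!H'))    # 2

odd = b'\x01\x02\x03'           # 3 bytes: one short plus a stray byte
fmt = '!%dH' % (len(odd) // 2)  # '!1H' describes only 2 bytes
try:
    struct.unpack(fmt, odd)
except struct.error as exc:
    print('cannot unpack:', exc)

# One possible workaround (an assumption, not from the original code):
# unpack only the bytes that form whole shorts.
usable = len(odd) - len(odd) % 2
print(struct.unpack(fmt, odd[:usable]))   # (258,)
```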
