
Analogs of Perl's pack/unpack (B*) functions in Python?

I need to port code from Perl that packs a byte string. In Perl it looks like this:

pack 'B*', '0100001000111110010100101101000010010001'

I don't see an analog of the B* format in Python's struct module. Is there a ready-made solution, so that I don't have to reinvent the wheel?

Honestly, the documentation is not clear to me, so I can't even work out how to implement it myself:

Likewise, the b and B formats pack a string that's that many bits long. Each such format generates 1 bit of the result. These are typically followed by a repeat count like B8 or B64.

Each result bit is based on the least-significant bit of the corresponding input character, i.e., on ord($char)%2. In particular, characters "0" and "1" generate bits 0 and 1, as do characters "\000" and "\001".

Starting from the beginning of the input string, each 8-tuple of characters is converted to 1 character of output.

With format b, the first character of the 8-tuple determines the least-significant bit of a character; with format B, it determines the most-significant bit of a character.

If the length of the input string is not evenly divisible by 8, the remainder is packed as if the input string were padded by null characters at the end. Similarly during unpacking, "extra" bits are ignored.

If the input string is longer than needed, remaining characters are ignored.

A * for the repeat count uses all characters of the input field. On unpacking, bits are converted to a string of 0s and 1s.

So the string is divided into chunks of 8 characters. If the last chunk has fewer than 8 characters, it is padded at the end with null characters to make 8. Then each chunk becomes a byte.
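The steps above can be sketched in Python. This is only a reading of the quoted documentation, not a verified port of Perl's behavior; the name pack_bits and its parameters count and msb_first are hypothetical. Under this reading, B8 means "use 8 input characters, producing 1 byte" and B64 means "use 64 input characters, producing 8 bytes".

```python
def pack_bits(chars, count=None, msb_first=True):
    """Rough analog of Perl's pack('B<count>', ...) or pack('b<count>', ...).

    count=None plays the role of '*': use every input character.
    msb_first=True corresponds to format B, False to format b.
    """
    if count is not None:
        # Characters beyond the repeat count are ignored.
        chars = chars[:count]
    out = bytearray()
    for start in range(0, len(chars), 8):
        group = chars[start:start + 8]
        byte = 0
        for pos, ch in enumerate(group):
            # Only the least-significant bit of each character matters,
            # so '0'/'1' and '\x00'/'\x01' behave the same.
            bit = ord(ch) % 2
            shift = 7 - pos if msb_first else pos
            byte |= bit << shift
        # A short final group leaves the remaining bits at zero,
        # which is the same as padding the input with null characters.
        out.append(byte)
    return bytes(out)


print(pack_bits('0100001000111110010100101101000010010001'))
# b'B>R\xd0\x91'
```

With format b the same characters fill each byte from the least-significant bit upward, so pack_bits('101', msb_first=False) gives b'\x05' while pack_bits('101') gives b'\xa0'.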

But I can't understand what the resulting bits are. What do B8 and B64 mean here?

The int object has a to_bytes method:

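binary = '0100001000111110010100101101000010010001'
# Pad on the right to a whole number of bytes, as the B* description requires
padded = binary + '0' * (-len(binary) % 8)
number = int(padded, 2)
# Size the result from the string length so leading zero bytes are kept
print(number.to_bytes(len(padded) // 8, 'big'))
# b'B>R\xd0\x91'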
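For the reverse direction, the quoted documentation says that unpacking with B* converts the bits to a string of 0s and 1s. A minimal sketch of that, assuming most-significant-bit-first order as in format B (unpack_bits is a hypothetical name, not a standard function):

```python
def unpack_bits(data):
    # Render each byte as exactly 8 binary digits, most-significant bit first.
    return ''.join(format(byte, '08b') for byte in data)


print(unpack_bits(b'B>R\xd0\x91'))
# 0100001000111110010100101101000010010001
```

Any zero bits that were added as padding during packing come back as trailing '0' characters, so the original length has to be tracked separately if it matters.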

I'm not sure of the exact Perl semantics, but here's my guess at them:

import struct

def pack_bit_string(bs):
    ret = b''
    while bs:
        chunk, bs = bs[:8], bs[8:]
        # convert to an integer so we can pack it
        i = int(chunk, 2)
        # Left-align a trailing chunk shorter than 8 bits, which pads it
        # with zero bits on the right. The augmented assignment is
        # equivalent to i = i << (8 - len(chunk))
        i <<= 8 - len(chunk)
        ret += struct.pack('B', i)
    return ret

The comments are inline. If you know something like "the input is at most 64 bits", you can avoid the loop and use the Q format for struct.pack.
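A rough sketch of that loop-free variant, assuming the input consists only of '0' and '1' characters and is at most 64 bits long (pack_bit_string_64 is a hypothetical name):

```python
import struct


def pack_bit_string_64(bs):
    if len(bs) > 64:
        raise ValueError("expected at most 64 bits")
    if not bs:
        return b''
    nbytes = (len(bs) + 7) // 8
    # Left-align the bits in a 64-bit word so that any padding ends up
    # as zero bits on the right.
    value = int(bs, 2) << (64 - len(bs))
    # '>Q' packs a big-endian unsigned 64-bit integer; keep only the
    # leading bytes that actually carry input bits.
    return struct.pack('>Q', value)[:nbytes]


print(pack_bit_string_64('0100001000111110010100101101000010010001'))
# b'B>R\xd0\x91'
```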
