
Python: Extracting bits from a byte

I'm reading a binary file in Python and the documentation for the file format says:

Flag (in binary)    Meaning

1 nnn nnnn          Indicates that there is one data byte to follow that is to be duplicated nnn nnnn (127 maximum) times.

0 nnn nnnn          Indicates that there are nnn nnnn bytes of image data to follow (127 bytes maximum) and that there are no duplications.

n 000 0000          End of line field. Indicates the end of a line record. The value of n may be either zero or one. Note that the end of line field is required and that it is reflected in the length of line record field mentioned above.

When reading the file I'm expecting the byte I'm at to return 1 nnn nnnn, where the nnn nnnn part should be 50.

I've been able to do this using the following:

flag = byte >> 7
numbytes = int(bin(byte)[3:], 2)

But the numbytes calculation feels like a cheap workaround.

Can I do more bit math to accomplish the calculation of numbytes?

How would you approach this?

The classic approach to checking whether a bit is set is to use the bitwise "and" operator, i.e.

x = 10 # 1010 in binary
if x & 0b10:  # explicitly: x & 0b0010 != 0
    print('First bit is set')

To check whether the nth bit is set, use a power of two or, better, bit shifting:

def is_set(x, n):
    # Equivalent, using a power of two: return x & 2 ** n != 0
    # A more bitwise- and performance-friendly version:
    return x & (1 << n) != 0

is_set(10, 1)  # bit 1, since the count starts at bit 0
# True
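As a small illustration of the shift-based check applied to the flag byte from the question, the sketch below lists the set bits of a sample byte and tests bit 7. The sample value 0xB2 (flag bit set, count 50) is chosen here for illustration and does not come from the file format documentation.

```python
def is_set(x, n):
    # True when bit n of x is 1, counting from bit 0 (least significant).
    return x & (1 << n) != 0

byte = 0xB2  # 0b10110010, an illustrative sample value

set_bits = [n for n in range(8) if is_set(byte, n)]
flag = is_set(byte, 7)  # bit 7 is the leading bit of a byte

print(set_bits)  # [1, 4, 5, 7]
print(flag)      # True
```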

You can strip off the leading bit by ANDing a mask with a byte from the file. That will leave you with the value of the remaining bits:

mask = 0b01111111
byte_from_file = 0b10101010
value = mask & byte_from_file
print(bin(value))
# 0b101010
print(value)
# 42

I find binary numbers easier to understand than hex when doing bit-masking.
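For readers who move between notations, here is a minimal sketch showing that the binary, hex and decimal spellings of the mask are the same number, and that format() with the "08b" spec prints all eight bits where bin() drops leading zeros. The VALUE_MASK name is used here only for illustration.

```python
VALUE_MASK = 0b01111111

# The same mask in binary, hex and decimal notation.
assert VALUE_MASK == 0x7F == 127

byte = 0b10101010
value = byte & VALUE_MASK

# bin() drops leading zeros; format() with "08b" pads to a full byte.
print(bin(value))            # 0b101010
print(format(value, "08b"))  # 00101010
print(format(byte, "08b"))   # 10101010
```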

EDIT: A slightly more complete example for your use case:

LEADING_BIT_MASK = 0b10000000
VALUE_MASK = 0b01111111

values = [0b10101010, 0b01010101, 0b00000000, 0b10000000]

for v in values:
    value = v & VALUE_MASK
    has_leading_bit = v & LEADING_BIT_MASK
    if value == 0:
        # n 000 0000: end of line, whatever the leading bit is
        print("EOL")
    elif has_leading_bit:
        print("leading one", value)
    else:
        print("leading zero", value)
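Building on the same two masks, a line record in the format described in the question could be expanded roughly as follows. This is only a sketch under assumptions: the decode_line name is hypothetical, the record is assumed to be a Python 3 bytes object that starts at a flag byte, and no handling of truncated input is attempted (a short record would raise IndexError or yield fewer literal bytes).

```python
def decode_line(record):
    """Expand one run-length-encoded line record into raw bytes."""
    out = bytearray()
    pos = 0
    while pos < len(record):
        flag_byte = record[pos]
        pos += 1
        count = flag_byte & 0x7F   # lower seven bits: nnn nnnn
        if count == 0:             # n 000 0000: end of line field
            break
        if flag_byte & 0x80:       # 1 nnn nnnn: repeat the next byte
            out += bytes([record[pos]]) * count
            pos += 1
        else:                      # 0 nnn nnnn: literal bytes follow
            out += record[pos:pos + count]
            pos += count
    return bytes(out)

# 0x83 'A' -> "AAA", 0x02 'B' 'C' -> "BC", 0x00 -> end of line
sample = bytes([0x83, 0x41, 0x02, 0x42, 0x43, 0x00])
print(decode_line(sample))  # b'AAABC'
```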

If I read your description correctly:

if (byte & 0x80) != 0:
    num_bytes = byte & 0x7F

There you go:

class ControlWord(object):
    """Helper class to deal with control words.

    Bit setting and checking methods are implemented.
    """
    def __init__(self, value = 0):
        self.value = int(value)
    def set_bit(self, bit):
        self.value |= bit
    def check_bit(self, bit):
        return self.value & bit != 0
    def clear_bit(self, bit):    
        self.value &= ~bit
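A short usage sketch for the flag byte from the question. Note that the `bit` arguments are masks such as 0x80, not bit indices. The class is repeated so the snippet runs on its own, and the REPEAT_FLAG name and the sample value 0xB2 are hypothetical.

```python
class ControlWord(object):
    """Same helper as above, repeated so this snippet is self-contained."""
    def __init__(self, value=0):
        self.value = int(value)
    def set_bit(self, bit):
        self.value |= bit
    def check_bit(self, bit):
        return self.value & bit != 0
    def clear_bit(self, bit):
        self.value &= ~bit

REPEAT_FLAG = 0x80  # hypothetical name for the leading bit of the flag byte

word = ControlWord(0xB2)                 # 0b10110010
is_repeat = word.check_bit(REPEAT_FLAG)  # is the leading bit set?
word.clear_bit(REPEAT_FLAG)              # drop the flag, keep the count
count = word.value

print(is_repeat, count)  # True 50
```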

Instead of int(bin(byte)[3:], 2), you could simply use: int(bin(byte>>1),2)
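To compare the string-based expressions with a plain mask, the sketch below evaluates each one on two sample bytes chosen for illustration: 0xB2 (flag bit set, count 50) and 0x32 (flag bit clear, count 50). The function names are hypothetical. In this check only the mask returns 50 for both inputs; the slice depends on the flag bit being set, and the right shift discards the lowest bit, not the highest.

```python
def via_slice(byte):
    # The expression from the question: drops "0b" plus one digit.
    return int(bin(byte)[3:], 2)

def via_shift(byte):
    # The shift-based expression: discards the lowest bit.
    return int(bin(byte >> 1), 2)

def via_mask(byte):
    # Keeps the lower seven bits regardless of the flag bit.
    return byte & 0x7F

for sample in (0xB2, 0x32):
    print(hex(sample), via_slice(sample), via_shift(sample), via_mask(sample))
# 0xb2 50 89 50
# 0x32 18 25 50
```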

Not sure I understood you correctly, but if I did, this should do the trick:

>>> x = 154  # just an example
>>> flag = x >> 7
>>> flag
1
>>> nb = x & 127
>>> nb
26
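The same split can also be written arithmetically with divmod, a different technique from the bitwise one: for non-negative integers, taking the top bit of a byte is integer division by 128, and masking with 127 is the remainder. A minimal sketch using the same sample value:

```python
x = 154  # same sample value: 0b10011010

# Quotient is the leading bit, remainder is the lower seven bits.
flag, nb = divmod(x, 128)

print(flag, nb)  # 1 26
```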

You can do it like this:

def GetVal(b):
   # mask off the most significant bit, see if it's set
   flag = b & 0x80 == 0x80
   # then look at the lower 7 bits in the byte.
   count = b & 0x7f
   # return a tuple indicating the state of the high bit, and the 
   # remaining integer value without the high bit.
   return (flag, count)

>>> testVal = 50 + 0x80
>>> GetVal(testVal)
(True, 50)
