Extracting bits from bytes
I am handling compressed data. The format contains a lookup table and an array of long ints, each of which may contain multiple values. The bit length of the contained values varies depending on the file. I have access to these longs as bytes: is there an easy way, within the standard library, to access a particular bit or bit range, or do I have to write one from scratch?
Note that I may have to continue into the next long value when the bit length isn't a factor of 64.
Theoretical example of what the code needs to do:
Take a long (4503672641818897L)
Convert it to binary (0000000000010000000000000001000100000000000000000001000100010001)
Extract bits 25 to 30 (5 bits this time) (00100)
Convert them to an int (4)
Here is a solution that does not depend on external libraries like numpy or on string conversions:
def get_bits(num, start, end, length=64):
    '''Like bits(num)[from:to] interpreted as int'''
    mask = 2**(end-start)-1
    shift = length - (end-start) - start
    return (num & (mask << shift)) >> shift

print(get_bits(17, 0, 3, length=6))  # 010001[0:3] -> 010 = 2
print(get_bits(17, 3, 6, length=6))  # 010001[3:6] -> 001 = 1
print(get_bits(17, 0, 6, length=6))  # 010001[0:6] -> 010001 = 17
print(get_bits(4503672641818897, 25, 30))  # ...[25:30] -> 00100 = 4
Explanation:
mask = 2**(end-start)-1: end-start is the number of bits to select (N). 2**N is a one followed by N zeros (2**3 -> 1000), so 2**N - 1 is N ones (1000 - 1 = 111).
shift = length - (end-start) - start: the number of bits by which we shift the mask to the left (111 << 3 = 111000), and also the number of bits by which we shift the result back to the right. For example, 010001 & 111000 is 010000, but we only want the first three bits, and 010000 >> 3 is 010.
return (num & (mask << shift)) >> shift: now we put it all together.
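The question also mentions values that continue into the next long. One way to handle that with the same mask-and-shift idea is to join the bytes into a single big integer first, so that a range crossing a 64-bit boundary needs no special case. The following is only a sketch: the name get_bits_across is hypothetical, and it assumes the longs are stored big-endian and that bits are numbered from the most significant bit of the first long.

```python
import struct


def get_bits_across(raw, start, end):
    """Return bits [start:end] of raw (a bytes object) as an int.

    Assumption: raw is big-endian and bit 0 is the most significant
    bit of the first byte.
    """
    total_bits = 8 * len(raw)
    combined = int.from_bytes(raw, "big")   # one arbitrarily large int
    mask = (1 << (end - start)) - 1         # (end - start) ones
    shift = total_bits - end                # bits to the right of the range
    return (combined >> shift) & mask


# Two unsigned 64-bit values packed back to back (illustrative data only).
raw = struct.pack(">2Q", 4503672641818897, 0xA000000000000000)

print(get_bits_across(raw, 25, 30))  # range inside the first long
print(get_bits_across(raw, 62, 67))  # range crossing into the second long
```

Python integers have arbitrary precision, so the combined value can hold any number of longs. For very large arrays it may be cheaper to convert only the bytes that cover the requested range.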
Try the unpackbits function from numpy:
https://numpy.org/doc/stable/reference/generated/numpy.unpackbits.html
Bits = numpy.unpackbits(Bytes)
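As a slightly fuller sketch of this approach: numpy.unpackbits expects an array of uint8, so a bytes object has to be wrapped first, for example with numpy.frombuffer. The helper name bits_from_bytes is hypothetical, and the example assumes the bytes are big-endian, so that index 0 is the most significant bit.

```python
import numpy as np


def bits_from_bytes(raw, start, end):
    """Return bits [start:end] of raw (a bytes object) as an int."""
    # View the bytes as uint8 and expand each byte into 8 bits, MSB first.
    bits = np.unpackbits(np.frombuffer(raw, dtype=np.uint8))
    result = 0
    for bit in bits[start:end]:
        result = (result << 1) | int(bit)   # fold the bits back into an int
    return result


raw = (4503672641818897).to_bytes(8, "big")  # assumed big-endian layout
print(bits_from_bytes(raw, 25, 30))
```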
What about this solution:
unpacked = "{0:b}".format(long_int)
unpacked = "0"*(64-len(unpacked)) + unpacked
int(unpacked[25:30],2)
EDIT: DOES NOT WORK! The int constructor assumes a signed int, and there is no way to tell it to construct a uint.
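One possible reading of this failure, assuming the longs were decoded as signed values: formatting a negative int with "b" produces a leading minus sign (for example "-101") rather than a two's-complement bit pattern, so the padding and slicing land on the wrong characters. A minimal sketch of a workaround is to mask the value to 64 bits before formatting. The names MASK64 and bit_slice are hypothetical.

```python
MASK64 = (1 << 64) - 1  # 64 ones


def bit_slice(long_int, start, end):
    """Return bits [start:end] of a 64-bit value, counted from the MSB."""
    unsigned = long_int & MASK64          # reinterpret as unsigned 64-bit
    unpacked = format(unsigned, "064b")   # zero-padded to exactly 64 characters
    return int(unpacked[start:end], 2)


print(bit_slice(4503672641818897, 25, 30))
print(bit_slice(-1, 0, 5))  # negative input is treated as 64 one-bits
```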
Here's a hacky solution I found. It seems very unwieldy, though:
def bitValue(byteValue, start, length):
    """Extract length bits from byteValue at start, and return them as an integer"""
    return int(bin(byteValue).lstrip('0b').rjust(64, '0')[start:start+length], 2)