I am trying to convert a bit string into a byte string, in Python 3.x. In each byte, bits are filled from high order to low order. The last byte is filled with zeros if necessary. The bit string is initially stored as a "collection" of booleans or integers (0 or 1), and I want to return a "collection" of integers in the range 0-255. By collection, I mean a list or a similar object, but not a character string: for example, the function below returns a generator.
So far, the fastest I am able to get is the following:
def bitsToBytes(a):
    s = i = 0
    for x in a:
        s += s + x
        i += 1
        if i == 8:
            yield s
            s = i = 0
    if i > 0:
        yield s << (8 - i)
I have tried several alternatives: using enumerate, building a list instead of a generator, computing s by "(s << 1) | x" instead of the sum, and everything seems to be a bit slower. Since this solution is also one of the shortest and simplest I found, I am rather happy with it.
However, I would like to know if there is a faster solution. In particular, is there a library routine that would do the job much faster, preferably in the standard library?
Example input/output
[] -> []
[1] -> [128]
[1,1] -> [192]
[1,0,0,0,0,0,0,0,1] -> [128,128]
Here I show the examples with lists. Generators would be fine. However, strings would not, since it would then be necessary to convert back and forth between list-like data and strings.
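The two ways of accumulating bits mentioned in the question, s += s + x and (s << 1) | x, can be compared with timeit. The following is only a rough sketch: the function names, the input size and the repetition count are arbitrary choices made for this example, and timings will differ between machines and Python versions.

```python
import random
import timeit

def bits_to_bytes_add(a):
    # Variant from the question: s += s + x doubles s, then adds the new bit.
    s = i = 0
    for x in a:
        s += s + x
        i += 1
        if i == 8:
            yield s
            s = i = 0
    if i > 0:
        yield s << (8 - i)  # left-align a partial final byte

def bits_to_bytes_shift(a):
    # Same loop, but accumulating with shift-and-or.
    s = i = 0
    for x in a:
        s = (s << 1) | x
        i += 1
        if i == 8:
            yield s
            s = i = 0
    if i > 0:
        yield s << (8 - i)

# Arbitrary test input: 10,000 random bits.
bits = [random.randint(0, 1) for _ in range(10_000)]

# Both variants must agree before their timings are worth comparing.
assert list(bits_to_bytes_add(bits)) == list(bits_to_bytes_shift(bits))

for fn in (bits_to_bytes_add, bits_to_bytes_shift):
    elapsed = timeit.timeit(lambda: list(fn(bits)), number=20)
    print(f"{fn.__name__}: {elapsed:.4f} s for 20 runs")
```

Wrapping each call in list() forces the generator to run to completion, so the measurement covers the whole conversion rather than just creating the generator object.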
The simplest tactic is to consume the bits in chunks of 8 and, when the input runs out, catch the exception and pad with zeros:
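def getbytes(bits):
    done = False
    while not done:
        byte = 0
        nbits = 0  # number of real input bits in this byte
        for _ in range(0, 8):
            try:
                bit = next(bits)
                nbits += 1
            except StopIteration:
                # input exhausted: pad the rest of the byte with zeros
                bit = 0
                done = True
            byte = (byte << 1) | bit
        if nbits:  # do not emit a byte that consists only of padding
            yield byte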
Usage:
lst = [1,0,0,0,0,0,0,0,1]
for b in getbytes(iter(lst)):
    print(b)
getbytes is a generator and accepts an iterator (such as another generator), so it works fine with large and potentially infinite streams.
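To illustrate the point about infinite streams, here is a minimal sketch that feeds an endless bit source into a byte generator and takes only the first few bytes. It is not the code above verbatim: it is a variant written with itertools.islice, and the name getbytes_islice is made up for this example.

```python
from itertools import cycle, islice

def getbytes_islice(bits):
    # Hypothetical variant: pull up to 8 bits at a time with islice.
    it = iter(bits)  # accept any iterable, not only iterators
    while True:
        chunk = list(islice(it, 8))
        if not chunk:
            return  # input exhausted exactly on a byte boundary
        byte = 0
        # Pad a short final chunk with zeros on the low-order side.
        for bit in chunk + [0] * (8 - len(chunk)):
            byte = (byte << 1) | bit
        yield byte

endless = cycle([1, 0])  # infinite stream 1, 0, 1, 0, ...
first_three = list(islice(getbytes_islice(endless), 3))
print(first_three)  # [170, 170, 170], since 0b10101010 == 170
```

Because both sides are lazy, only the 24 bits needed for three bytes are ever read from the endless source.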
Step 1: Add buffer zeros at the end so the length is a multiple of 8
Step 2: Concatenate into a single string
Step 3: Save off 8 bits at a time into an array
Step 4: ???
Step 5: Profit
def bitsToBytes(a):
    a = a + [0] * (-len(a) % 8)  # adding extra 0 values at the end to make a multiple of 8 bits
    s = ''.join(str(int(x)) for x in a)  # joins all bits; int() also handles booleans
    returnInts = []
    for i in range(0, len(s), 8):
        returnInts.append(int(s[i:i+8], 2))  # goes 8 bits at a time to save as ints
    return returnInts
Using itertools' grouper() recipe:
from functools import reduce
from itertools import zip_longest

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx"
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

bytes = [reduce(lambda byte, bit: byte << 1 | bit, eight_bits)
         for eight_bits in grouper(bits, 8, fillvalue=0)]
[] -> []
[1] -> [128]
[1, 1] -> [192]
[1, 0, 0, 0, 0, 0, 0, 0, 1] -> [128, 128]
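To check the recipe against the example table, the comprehension can be wrapped in a function. This is a minimal sketch; the name bits_to_bytes_grouped is made up here for illustration, and the grouping is inlined rather than calling grouper().

```python
from functools import reduce
from itertools import zip_longest

def bits_to_bytes_grouped(bits):
    # Same trick as grouper(): 8 references to ONE iterator, so zip_longest
    # pulls 8 consecutive bits per tuple and pads the last tuple with 0.
    chunks = zip_longest(*[iter(bits)] * 8, fillvalue=0)
    # Fold each 8-tuple into an integer, most significant bit first.
    return [reduce(lambda byte, bit: byte << 1 | bit, chunk) for chunk in chunks]

examples = [
    ([], []),
    ([1], [128]),
    ([1, 1], [192]),
    ([1, 0, 0, 0, 0, 0, 0, 0, 1], [128, 128]),
]
for bits, expected in examples:
    assert bits_to_bytes_grouped(bits) == expected
```

Since the function only iterates over its argument, it also accepts generators and collections of booleans.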
If input is a string then a specialized solution might be faster:
>>> bits = '100000001'
>>> padded_bits = bits + '0' * (8 - len(bits) % 8)
>>> padded_bits
'1000000010000000'
>>> list(int(padded_bits, 2).to_bytes(len(padded_bits) // 8, 'big'))
[128, 128]
Note that an extra zero byte is appended at the end if len(bits) % 8 == 0.
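If that trailing zero byte is unwanted, one option is to compute the padding length as -len(bits) % 8, which is 0 when the length is already a multiple of 8. A minimal sketch under that assumption; the function name is made up for this example, and the empty string is handled separately because int('', 2) raises ValueError.

```python
def bit_string_to_bytes(bits):
    # -len(bits) % 8 is the number of zeros needed to reach a multiple of 8
    # (0 if the length already is one).
    padded = bits + "0" * (-len(bits) % 8)
    if not padded:
        return []  # int("", 2) would raise ValueError
    # Parse the whole string as one big integer, then split it into bytes.
    return list(int(padded, 2).to_bytes(len(padded) // 8, "big"))

print(bit_string_to_bytes("100000001"))  # [128, 128]
print(bit_string_to_bytes("10000000"))   # [128]
```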