
Convert string of 0s and 1s to byte in Python

I have a string representation of binary integers and I need bytes with exactly the same bit structure, to send over a socket.

For example, if I have a string of length 16, 0000111100001010, then I need 2 bytes with the same bit structure. In this case, the first byte should have an int value of 15 and the second a value of 10. It doesn't matter whether they can be printed in ASCII or not. How do I get this?

I tried the following method, which creates bytes of the form 0xf0xa. But the result is 6 bytes long instead of 2.

def getByte(s):
  if(len(s) != 8):
    return
  b = b'0'
  for c in s:
    b = (int(b) | int(c)) & 0x0000ff #This makes b an integer
    b = b << 1
  b = b >> 1 #because of 1 extra shift
  b = hex(b).encode('utf-8') #how else can I get back to byte from int?

  return(b) 

This method takes a string of length 8 and is meant to give a byte with the same internal bit structure, but it fails. (I need something similar to strtol in C.)

Any help, please?

First, if you have the bit string as a literal value, just make it a base-2 int literal, instead of a string literal:

value = 0b0000111100001010

If you have non-literal bit strings, and all you need to do is parse them into integers, then, as martineau says in a comment, the built-in int constructor is all you need, because it takes a base as an optional second argument:

value = int('0000111100001010', 2)
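For bit strings of arbitrary length, the same idea can be pushed a little further. The following is only a sketch, assuming Python 3 (where int.to_bytes is available) and an input whose length is a multiple of 8; the helper name bits_to_bytes is hypothetical and not part of any library:

```python
def bits_to_bytes(bits: str) -> bytes:
    """Convert a string of '0'/'1' characters into bytes with the same bit layout."""
    if len(bits) % 8 != 0:
        raise ValueError("bit string length must be a multiple of 8")
    if not bits:
        return b""  # int('', 2) would raise, so handle the empty case explicitly
    # int(bits, 2) parses the string as base 2; to_bytes with 'big' keeps the
    # leftmost 8 bits in the first byte, matching the order of the string.
    return int(bits, 2).to_bytes(len(bits) // 8, byteorder="big")


result = bits_to_bytes("0000111100001010")
print(list(result))  # [15, 10]
```

Passing the byte count explicitly is what preserves leading zero bits, which would otherwise be lost once the string becomes an integer.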

If you need to do anything fancy with bit strings, you'll probably want to use a third-party module like bitarray or bitstring, which let you create objects that can be treated as strings of 1s and 0s, sequences of booleans, integers, etc.:

import bitstring
value = bitstring.BitArray(bin='0000111100001010')

Once you have an integer, you can pack it into 2 bytes with struct, as martineau also explained in a comment:

import struct
my_bytes = struct.pack('!H', value)

The ! means "network-endian". If you want little-endian or native-endian (or big-endian, which is of course the same as network-endian, but might be a more meaningful description in some contexts), see Byte Order, Size, and Alignment in the struct documentation. The H means to pack it as a C unsigned short, that is, two bytes.
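To make the byte-order prefixes concrete, here is a small comparison using the same value. It is a sketch assuming Python 3, and the variable names are only illustrative:

```python
import struct

value = int("0000111100001010", 2)  # 3850, i.e. 0x0F0A

big = struct.pack(">H", value)      # big-endian: most significant byte first
network = struct.pack("!H", value)  # network order, identical to big-endian
little = struct.pack("<H", value)   # little-endian: least significant byte first

print(list(big))      # [15, 10]
print(list(network))  # [15, 10]
print(list(little))   # [10, 15]
```

Only the big-endian and network forms keep the bytes in the same order as the original bit string.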


But if you're using a third-party module, it probably has something simpler. For example, if you have a bitstring.BitArray from the previous example:

my_bytes = value.tobytes()

A simple way to convert binary string data like the one you have is to use the built-in int() function and tell it the number is in base 2 binary instead of the default base 10 decimal format:

int('0000111100001010', 2)

This will return an integer value. To convert that into a string of bytes you can use the pack() function in the struct module and tell it the data argument is a short (2-byte) unsigned integer by using a format string of 'H':

import struct
struct.pack('!H', int('0000111100001010', 2))

Since you want to send this over a network socket, I also added a '!' prefix, which indicates that the bytes returned should be in "network" or big-endian byte-order rather than the native format of your computer (which might be different).

Note the string returned for the example will be '\x0f\n' (shown as b'\x0f\n' in Python 3). The '\n' at the end is because the byte value 0x0a happens to be the ASCII newline character, so Python represents it that way when it displays the repr() of a string that contains one (which is what the Python interactive console does automatically after every expression).
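A quick way to confirm that the result really is two bytes with the values 15 and 10, regardless of how it is displayed, is to look at its length and its individual bytes. This sketch assumes Python 3, where indexing a bytes object yields integers:

```python
import struct

packed = struct.pack("!H", int("0000111100001010", 2))

print(repr(packed))          # b'\x0f\n'
print(len(packed))           # 2
print(packed[0], packed[1])  # 15 10
```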
