
Converting a string to a bytes object as is

How do I turn a string into a bytes object as is, i.e. without encoding it? I can't use .encode() here, because it corrupts my binary file when I save it.

filedata = pathlib.Path('file.bin').read_bytes()
# since i can't modify a bytes object, i should convert it to a string, should I?
data = ''
for i in filedata:
    data += chr(i) if isinstance(i, int) else i
data[3] = '\x01'
data += '\x58\x02\x0C\x80\x61\x39\x56\x18\x55\x61\x89\x42\x42\x16\x46\x17\x54\x70\x10\x58\x60\x10\x10\x01\x75\x10\xF0\xC0\x00\x01\x00\x02\x00\xC0\x00\xD0\x00\x01\x00\xC4\x00\x01\x00\x02\x00\x01\x00\x00\x02\x00\x00\x00'
pathlib.Path('result.bin').write_bytes(data.encode()) # doesn't work as it should

So instead of this:

58 02 0C 80 61 39 56 18 55 61 89 42 42 16 46 17 54 70 10 58 60 10 10 01 75 10 F0 C0 00 01 00 02 00 C0 00 D0 00 01 00 C4 00 01 00 02 00 01 00 00 02 00 00 00

I get this:

58 02 0C C2 80 61 39 56 18 55 61 C2 89 42 42 16 46 17 54 70 10 58 60 10 10 01 75 10 C3 B0 C3 80 00 01 00 02 00 C3 80 00 C3 90 00 01 00 C3 84 00 01 00 02 00 01 00 00 02 00 00 00

I tried modifying the bytes object itself, but I always get this error:

TypeError: 'bytes' object does not support item assignment

How do I turn a string into a bytes object AS IS, i.e. without encoding it?

You can't. As of Python 3, that's a contradiction in terms.

A string is a sequence of text characters. Think letters, punctuation, white-space, even control characters. A bytes object is a sequence of 8-bit numbers. How the two sequences are related is a question of encoding. There is no way around it.
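To make the role of the encoding concrete, here is a small sketch. The sample string is just the first four characters of the data appended in the question; the choice of encodings is mine, for illustration. UTF-8 turns every code point from U+0080 upward into two or more bytes, which is where the extra C2 and C3 bytes in the question's output come from. Latin-1, by contrast, happens to map code points 0 to 255 onto the byte with the same value. It is still an encoding, just one that looks like "no encoding" for this range.

```python
# Four characters with code points 0x58, 0x02, 0x0C and 0x80.
text = '\x58\x02\x0c\x80'

utf8 = text.encode('utf-8')      # U+0080 becomes the two bytes C2 80
latin1 = text.encode('latin-1')  # each code point 0-255 becomes one byte

# bytes.hex() with a separator needs Python 3.8 or later.
print(utf8.hex(' '))    # 58 02 0c c2 80
print(latin1.hex(' '))  # 58 02 0c 80
```

A character above U+00FF cannot be encoded as Latin-1 at all and raises UnicodeEncodeError, so this only round-trips strings that were built from byte values in the first place.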

Text characters should be thought of as abstract entities. The letter A, for example, simply exists. There is no number associated with it per se. (Internally, it is represented by a Unicode code point, which is a number, but that's an implementation detail.)

In the code above, you're reading bytes and you're writing bytes, and in between you want to manipulate the byte stream: change one of the numbers, append others.

Python bytes are no different from str in that regard: they are both immutable types. If you did the same as above but with a string, you'd get the same kind of error:

>>> s = 'abcd'
>>> s[3] = 'x'
TypeError: 'str' object does not support item assignment

That is, in-place character manipulation is not supported for strings, although there are other ways to achieve the same result. In-place byte manipulation, on the other hand, is supported, arguably because it is a more common use case for bytes than for strings. You just need to use bytearray instead of bytes:

>>> data = bytearray(b'\x00\x01\x02\x03\x04')
>>> data[3] = 255
>>> print(data)
bytearray(b'\x00\x01\x02\xff\x04')

Which you can then write to a file without any encoding whatsoever:

pathlib.Path('result.bin').write_bytes(data)

(Note that bytes literals must be prefixed with b.)
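Applied to the task in the question, the whole round trip might look roughly like this. It is a sketch under assumptions: the file contents are made up, and a temporary directory stands in for the real location of file.bin and result.bin so that the example can run anywhere.

```python
import pathlib
import tempfile

# Work in a throwaway directory so the sketch is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    src = pathlib.Path(tmp) / 'file.bin'
    dst = pathlib.Path(tmp) / 'result.bin'
    src.write_bytes(b'\x00\x10\x20\x30\x40')  # made-up sample content

    data = bytearray(src.read_bytes())  # mutable copy of the file's bytes
    data[3] = 0x01                      # overwrite one byte in place
    data += b'\x58\x02\x0c\x80'         # append raw bytes, no encoding step
    dst.write_bytes(data)               # write_bytes accepts a bytearray

    result = dst.read_bytes()

print(result.hex(' '))  # 00 10 20 01 40 58 02 0c 80
```

No str object appears anywhere, so there is nothing to encode and the 0x80 byte is written as a single byte.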

Solved (thanks to John):

import pathlib
import sys

filedata = bytearray(pathlib.Path(sys.argv[1]).read_bytes())
# filedata = bytearray(open(sys.argv[1], 'rb').read()) also works
filedata2 = bytearray(b'\x00\x01') # placeholder for a second bytearray
filedata[1] = 255 # modifying a single byte (0 - 255)
filedata[0:0] = b'\xff' # inserting bytes at the start ([0:1] would replace the first byte instead)
filedata.append(255) # appending one single byte (extend() needs an iterable, not an int)
filedata.extend(filedata2) # appending another array of bytes (bytearray object)
filedata.extend(b'\xff\xff') # appending bytes
filedata.extend([255, 255]) # appending bytes too
pathlib.Path(sys.argv[1]).write_bytes(filedata) # write data to a file
# open(sys.argv[1], 'wb').write(filedata) should work too (note 'wb', not 'rb')
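For reference, the bytearray operations above behave as follows on a small in-memory buffer. This is an illustrative sketch with made-up values, not part of the original post. The points to watch are that assigning to a non-empty slice replaces those bytes while assigning to an empty slice inserts, and that append() takes one integer while extend() takes an iterable.

```python
buf = bytearray(b'\x00\x01\x02')

buf[1] = 0xff            # replace one byte       -> 00 ff 02
buf[0:1] = b'\xaa\xbb'   # replace slice [0:1]    -> aa bb ff 02
buf[0:0] = b'\x11'       # empty slice: insert    -> 11 aa bb ff 02
buf.append(0x7f)         # one byte, as an int    -> ... 02 7f
buf.extend(b'\x80\x81')  # any bytes-like object  -> ... 7f 80 81
buf.extend([1, 2])       # or an iterable of ints -> ... 80 81 01 02

# extend() with a bare int is an error, because an int is not iterable.
try:
    buf.extend(255)
    extend_int_failed = False
except TypeError:
    extend_int_failed = True

print(buf.hex(' '))  # 11 aa bb ff 02 7f 80 81 01 02
```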

This was originally added to revision 5 of the question.
