
Appending data bytes to binary file with Python

I want to append a crc that I calculate to an existing binary file.

For example, the crc is 0x55667788.

I want to append 0x55, 0x66, 0x77 and 0x88 to the end of the file.

For example, if I open the file in HexEdit, the last four bytes of the file will show 0x55667788.

Here is my code so far:

fileopen = askopenfilename()
filename = open(fileopen, 'rb+')
filedata = filename.read()
filecrc32 = hex(binascii.crc32(filedata))
filename.seek(0,2)
filename.write(filecrc32)
filename.close()

I get the following error:

File "C:\Users\cjackel\openfile.py", line 9, in <module>
filename.write(filecrc32)
TypeError: 'str' does not support the buffer interface

Any suggestions?

The hex function returns a string. In this case, you've got a 10-character string (the 0x prefix plus 8 hex digits) representing your 4-byte number, like this:

'0x55667788'

In Python 2.x, you would be allowed to write this incorrect data to a binary file (it would show up as the 10 bytes 30 78 35 35 36 36 37 37 38 38 rather than the four bytes you want, 55 66 77 88). Python 3.x is smarter, and only allows you to write bytes (or bytearray or similar) to a binary file, not str.
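To make the difference concrete, here is a small sketch using the example value 0x55667788 from the question (the variable names are only for illustration). It shows that hex gives back text, and what that text looks like once it is turned into ASCII bytes:

```python
crc = 0x55667788

# hex() produces text, not binary data.
text = hex(crc)
print(repr(text), type(text).__name__, len(text))  # '0x55667788' str 10

# Encoding that text gives the ASCII codes of its characters,
# which is what Python 2 would have written to the file.
ascii_bytes = text.encode('ascii')
as_hex = ' '.join(format(b, '02x') for b in ascii_bytes)
print(as_hex)  # 30 78 35 35 36 36 37 37 38 38
```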


What you want here is not the hex string, but the actual bytes.

The way you described the bytes you want is called big-endian order: the most significant byte comes first. On most computers, the "native" order is the opposite, little-endian, which would put the bytes in the file as 88 77 66 55, so a hex editor would show 0x88776655 instead of 0x55667788.

In Python 3.2+, the simplest way to get that is the int.to_bytes method:

filecrc = binascii.crc32(filedata).to_bytes(4, byteorder='big', signed=False)

(The signed=False isn't really necessary, because it's the default, but it's a good way of making it explicit that you're definitely dealing with an unsigned 32-bit integer.)

If you're stuck with earlier versions, you can use the struct module:

filecrc = struct.pack('>I', binascii.crc32(filedata))

The > means big-endian, and the I means unsigned 4-byte integer, so this returns the same thing. In either case, for a CRC of 0x55667788, what you get is b'\x55\x66\x77\x88' (or, as Python would repr it, b'Ufw\x88').
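As a quick check, the following sketch packs the example value 0x55667788 both ways and compares the results; the little-endian line is included only to show the contrast with the byte order you don't want:

```python
import struct

crc = 0x55667788

via_to_bytes = crc.to_bytes(4, byteorder='big', signed=False)
via_struct = struct.pack('>I', crc)
little = struct.pack('<I', crc)  # little-endian, for comparison only

print(via_to_bytes == via_struct)  # True
print(via_to_bytes)                # b'Ufw\x88'
print(little)                      # b'\x88wfU'
```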


The error is a bit cryptic, because no novice is going to have any idea what "the buffer interface" is (especially since the 3.x documentation calls it the Buffer Protocol, and it's only documented as part of CPython's C extension API…), but effectively it means that you need a bytes-like object. Usually, this error will mean that you just forgot to encode your string to UTF-8 or some other encoding. But when you were trying to write actual binary data rather than encoded text, it's the same error.
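Putting the pieces together, a corrected version of the script might look like the sketch below. The function name append_crc32 is made up for this example, and it assumes Python 3.2+ so that int.to_bytes is available:

```python
import binascii


def append_crc32(path):
    """Append the big-endian CRC-32 of the file's current contents to the file."""
    with open(path, 'rb+') as f:
        data = f.read()
        # Keep the CRC as an int; the mask is a no-op on Python 3,
        # where crc32 already returns an unsigned value.
        crc = binascii.crc32(data) & 0xFFFFFFFF
        f.seek(0, 2)  # move to the end of the file before writing
        f.write(crc.to_bytes(4, byteorder='big'))
    return crc
```

In the original script this would be called as append_crc32(askopenfilename()).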

You need to serialize the data. Serialization here means extracting the individual bytes from the whole number. In your case, your CRC is a 4-byte number. Assuming filecrc32 holds the integer returned by binascii.crc32 (not the string returned by hex), the 4 individual bytes can be collected in a list as below:

serialized_crc = [(filecrc32 >> 24) & 0xFF, (filecrc32 >> 16) & 0xFF,
                  (filecrc32 >> 8) & 0xFF, filecrc32 & 0xFF]

The CRC can then be written to the file by converting the list to a bytearray as below:

filename.write(bytearray(serialized_crc))
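For reference, a self-contained sketch of this shift-and-mask approach is shown below. The sample payload is made up for illustration, and the variable is named crc to make it clear that it is an integer rather than a hex string:

```python
import binascii

data = b'example payload'   # stand-in for the file contents
crc = binascii.crc32(data)  # an int, not hex(...)

# Most significant byte first, i.e. big-endian order.
serialized_crc = [(crc >> 24) & 0xFF, (crc >> 16) & 0xFF,
                  (crc >> 8) & 0xFF, crc & 0xFF]
packed = bytearray(serialized_crc)

# Same four bytes that int.to_bytes produces on Python 3.2+.
print(bytes(packed) == crc.to_bytes(4, byteorder='big'))  # True
```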
