Converting a string to a bytes object as is

How do I turn a string into a bytes object as is, i.e. without encoding it? I can't use .encode() here, because it corrupts my binary file when I save it.

filedata = pathlib.Path('file.bin').read_bytes()
# since i can't modify a bytes object, i should convert it to a string, should I?
data = ''
for i in filedata:
    data += chr(i) if isinstance(i, int) else i
data[3] = '\x01'
data += '\x58\x02\x0C\x80\x61\x39\x56\x18\x55\x61\x89\x42\x42\x16\x46\x17\x54\x70\x10\x58\x60\x10\x10\x01\x75\x10\xF0\xC0\x00\x01\x00\x02\x00\xC0\x00\xD0\x00\x01\x00\xC4\x00\x01\x00\x02\x00\x01\x00\x00\x02\x00\x00\x00'
pathlib.Path('result.bin').write_bytes(data.encode()) # doesn't work as it should

So instead of this:

58 02 0C 80 61 39 56 18 55 61 89 42 42 16 46 17 54 70 10 58 60 10 10 01 75 10 F0 C0 00 01 00 02 00 C0 00 D0 00 01 00 C4 00 01 00 02 00 01 00 00 02 00 00 00

I get this:

58 02 0C C2 80 61 39 56 18 55 61 C2 89 42 42 16 46 17 54 70 10 58 60 10 10 01 75 10 C3 B0 C3 80 00 01 00 02 00 C3 80 00 C3 90 00 01 00 C3 84 00 01 00 02 00 01 00 00 02 00 00 00

I tried modifying the bytes object itself, but I always get this error:

TypeError: 'bytes' object does not support item assignment

How do I turn a string into a bytes object AS IS, i.e. without encoding it?

You can't. That's a contradiction in terms, as of Python 3.

A string is a sequence of text characters. Think letters, punctuation, white-space, even control characters. A bytes object is a sequence of 8-bit numbers. How the two sequences are related is a question of encoding. There is no way around it.

Text characters should be thought of as abstract entities. The letter A, for example, simply exists. There is no number associated with it per se. (Internally, it is represented by a Unicode code point, which is a number, but that's an implementation detail.)
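To make the distinction concrete, here is a small sketch (not part of the original answer) of how the same code points turn into different bytes depending on the encoding. It also seems to account for the extra C2 and C3 bytes in the question's output: .encode() defaults to UTF-8, which uses two bytes for every code point from U+0080 to U+07FF. The variable names are illustrative only.

```python
# A string holds code points; bytes only appear once an encoding is chosen.
text = '\x0c\x80\xf0'  # three code points: U+000C, U+0080, U+00F0

print(ord('A'))  # 65, the code point of the letter A

utf8 = text.encode()             # UTF-8 is the default encoding
latin1 = text.encode('latin-1')  # Latin-1 maps U+0000..U+00FF to one byte each

print(utf8.hex())    # 0cc280c3b0
print(latin1.hex())  # 0c80f0
```

Latin-1 happens to round-trip a string built with chr() from values 0 to 255, but taking a detour through a string is unnecessary for this task.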

In the code above, you're reading bytes and you're writing bytes, and in between you want to manipulate the byte stream: change one of the numbers, append others.

Python bytes are no different from str in that regard: they are both immutable types. If you did the same as above but with a string, you'd get the same kind of error:

>>> s = 'abcd'
>>> s[3] = 'x'
TypeError: 'str' object does not support item assignment

That is, in-place character manipulation is not supported for strings. There are other ways to achieve the same result though. In-place byte manipulation, on the other hand, is supported, arguably because it's a more common use case than it is for strings. You just need to use bytearray instead of bytes:

>>> data = bytearray(b'\x00\x01\x02\x03\x04')
>>> data[3] = 255
>>> print(data)
bytearray(b'\x00\x01\x02\xff\x04')

Which you can then write to a file without any encoding whatsoever:

pathlib.Path('result.bin').write_bytes(data)

(Note that bytes literals must be prefixed with b.)
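Putting the pieces together, the task from the question could look roughly like the sketch below. It is only an illustration under assumptions: the temporary directory, the stand-in file content and the shortened byte string are made up so that the example runs on its own.

```python
import pathlib
import tempfile

# Work in a temporary directory so the sketch does not touch real files.
workdir = pathlib.Path(tempfile.mkdtemp())
source = workdir / 'file.bin'    # hypothetical input file
target = workdir / 'result.bin'  # hypothetical output file
source.write_bytes(b'\x00\x01\x02\x03\x04')  # stand-in content

data = bytearray(source.read_bytes())  # mutable copy of the file content
data[3] = 0x01                         # overwrite one byte in place
data += b'\x58\x02\x0c\x80'            # append raw bytes, note the b prefix
target.write_bytes(data)               # written as is, no encoding step

print(target.read_bytes().hex())  # 000102010458020c80
```

No str object is involved at any point, so there is nothing to encode and values above 0x7F are written unchanged.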

Solved (thanks to John):

import pathlib
import sys

filedata = bytearray(pathlib.Path(sys.argv[1]).read_bytes())
# filedata = bytearray(open(sys.argv[1], 'rb').read()) also works
filedata2 = bytearray(b'\x00\x01') # placeholder for a second bytearray (example value)
filedata[1] = 255 # modifying a single byte (0 - 255)
filedata[0:0] = b'\xff' # inserting bytes at the start (filedata[0:1] = ... would replace the first byte)
filedata.append(255) # appending one single byte (extend() needs an iterable, not an int)
filedata.extend(filedata2) # appending another array of bytes (bytearray object)
filedata.extend(b'\xff\xff') # appending bytes
filedata.extend([255, 255]) # appending bytes too
pathlib.Path(sys.argv[1]).write_bytes(filedata) # write data to a file
# open(sys.argv[1], 'wb').write(filedata) should work too (note 'wb', not 'rb')

This was originally added to revision 5 of the question.
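As a side note, the bytearray operations used in that snippet differ in small ways that are easy to mix up. The sketch below uses a made-up buffer named buf to show, as far as I can tell, how slice assignment, append() and extend() behave:

```python
buf = bytearray(b'\x01\x02\x03')

buf[0:1] = b'\xaa\xbb'    # non-empty slice: replaces the first byte with two bytes
print(buf.hex())          # aabb0203

buf[0:0] = b'\xff'        # empty slice: inserts without removing anything
print(buf.hex())          # ffaabb0203

buf.append(0x10)          # append() takes a single int in range(256)
buf.extend([0x20, 0x30])  # extend() takes an iterable of ints, or a bytes-like object
print(buf.hex())          # ffaabb0203102030

try:
    buf.extend(255)       # a bare int is not iterable
except TypeError as exc:
    print('TypeError:', exc)
```

So a single value goes through append(), while extend() is for sequences of values.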
