
How to define a binary string in Python in a way that works with both py2 and py3?

I am writing a module that is supposed to work in both Python 2 and 3, and I need to define a binary string.

Usually this would be something like data = b'abc', but this code fails on Python 2.5 with an invalid syntax error.

How can I write the above code in a way that will work in all versions of Python from 2.5 onward?

Note: this has to be binary (it can contain any byte value, for example 0xFF); this is very important.

I would recommend the following:

from six import b

That requires the six module, of course. If you don't want that dependency, here's another version:

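import sys
if sys.version_info[0] < 3:
    # Python 2: a plain string literal is already a byte string.
    def b(x):
        return x
else:
    import codecs
    # Python 3: map code points 0-255 one-to-one onto byte values.
    def b(x):
        return codecs.latin_1_encode(x)[0]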

More info.

These solutions (essentially the same) work, are clean, are as fast as you are going to get, and support all 256 byte values (which none of the other solutions here can).
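A quick way to check the 256-value claim is sketched below. It restates the helper so the snippet runs on its own, and it uses bytearray for the comparison, which assumes Python 2.6 or later; the name all_values is only illustrative.

```python
import codecs
import sys

if sys.version_info[0] < 3:
    def b(x):
        return x
else:
    def b(x):
        return codecs.latin_1_encode(x)[0]

# Build a text literal holding every code point from 0 to 255
# (on Python 2 this is already a plain byte string).
all_values = b(''.join([chr(i) for i in range(256)]))

# bytearray yields integers on both major versions, so the comparison
# does not depend on how str/bytes index or compare.
assert list(bytearray(all_values)) == list(range(256))
```

In real code the argument would normally be a literal with escapes, such as b('\x80\xff').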

If the string only has ASCII characters, call encode. This will give you a str in Python 2 (just like b'abc'), and a bytes in Python 3:

'abc'.encode('ascii')
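The ASCII restriction matters in practice. As a minimal sketch, the snippet below shows the call succeeding for plain ASCII and failing for a value above 0x7F; the variable names are illustrative only. Catching UnicodeError covers both major versions, since Python 2 raises UnicodeDecodeError here and Python 3 raises UnicodeEncodeError.

```python
# Works on both major versions: only ASCII characters are involved.
data = 'abc'.encode('ascii')

# A value above 0x7F cannot be expressed this way.
failed = False
try:
    '\xff'.encode('ascii')
except UnicodeError:
    failed = True
```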

If the data is not pure ASCII, rather than putting binary data in the source, create a data file, open it with 'rb', and read from it.
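The data-file approach might look like the sketch below. The temporary file stands in for a real data file shipped with the module, and it is written here only so the example is runnable; binascii.unhexlify is used just to produce the two sample bytes without a b'' literal. try/finally is used instead of a with block because Python 2.5 needs a __future__ import for with.

```python
import binascii
import os
import tempfile

# Create a stand-in data file (hypothetical; a real module would ship one).
fd, path = tempfile.mkstemp()
os.close(fd)
f = open(path, 'wb')
try:
    f.write(binascii.unhexlify('80FF'))
finally:
    f.close()

# Reading in 'rb' mode gives str on Python 2 and bytes on Python 3.
f = open(path, 'rb')
try:
    data = f.read()
finally:
    f.close()

os.remove(path)
```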

You could store the data base64-encoded.

The first step would be to transform it into base64:

>>> import base64
>>> base64.b64encode(b"\x80\xFF")
b'gP8='

This needs to be done only once; whether you write the b prefix depends on the version of Python you use for this step.

In the second step, put the resulting base64 text into your program without the b prefix. That ensures it works in both py2 and py3.

import base64
x = 'gP8='
base64.b64decode(x.encode("latin1"))

This gives you a str '\x80\xff' in 2.6 (it should work in 2.5 as well) and a b'\x80\xff' in 3.x.

As an alternative to the two steps above, you can do the same with hex data:

import binascii
x = '80FF'
binascii.unhexlify(x) # `bytes()` in 3.x, `str()` in 2.x
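As a sanity check, the base64 and hex variants should decode to the same two bytes. The sketch below compares them; the variable names are illustrative, and the bytearray conversion assumes Python 2.6 or later.

```python
import base64
import binascii

# Both text forms are plain ASCII, so no b'' literal is needed.
from_b64 = base64.b64decode('gP8='.encode('latin1'))
from_hex = binascii.unhexlify('80FF')

# The two encodings describe the same binary data.
assert from_b64 == from_hex

# Integer view of the decoded bytes, identical on both major versions.
values = list(bytearray(from_hex))
```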
