
Python 2 vs 3: consistent results with getting a byte from byte string

Is there any simple way to get consistent results in both Python 2 and Python 3 for an operation like "give me the N-th byte of a byte string"? Getting either byte-as-integer or byte-as-character will do for me, as long as it is consistent.

I.e., given:

s = b"123"

The naïve approach yields:

s[1] # => Python 2: '2', <type 'str'>
s[1] # => Python 3: 50, <class 'int'>

Wrapping that in ord(...) yields an error in Python 3:

ord(s[1]) # => Python 2: 50, <type 'int'> 
ord(s[1]) # => Python 3: TypeError: ord() expected string of length 1, but int found

I can think of a fairly complicated compat solution:

ord(s[1]) if (type(s[1]) == type("str")) else s[1] # 50 in both Python 2 and 3

... but maybe there's an easier way that I just haven't noticed?

A length-1 slice will also be a byte sequence in both 2.x and 3.x:

s = b'123'
s[1:2] # 3.x: b'2'; 2.x: '2', which is the same thing but the repr() rules are different.
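
If an integer is wanted rather than a one-byte string, the slice can be passed to ord(), which accepts a length-1 byte string on both versions. A minimal sketch, assuming the same s as above (the names one_byte and as_int are just for illustration):

```python
s = b"123"

# A length-1 slice is a byte string on both Python 2 and Python 3,
# and ord() accepts a length-1 byte string on both.
one_byte = s[1:2]
as_int = ord(one_byte)

print(repr(one_byte))  # b'2' on Python 3, '2' on Python 2
print(as_int)          # 50 on both
```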

What about something like this?

import sys

if sys.version_info.major == 3:
    def index(s, n):
        return s[n]
elif sys.version_info.major == 2:
    def index(s, n):
        return ord(s[n])
else:
    raise NotImplementedError

If you use the bytearray type (converting if needed), behavior will be identical on both versions, always matching Python 3 behavior. That's because bytearray is a distinct type on Python 2 (with Python 3 behavior), whereas bytes there is just an alias for str.
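
A minimal sketch of the bytearray approach, assuming the same s as in the question (buf, second and piece are illustrative names, not part of any API):

```python
s = b"123"

# Indexing a bytearray yields an int on both Python 2 and Python 3.
buf = bytearray(s)
second = buf[1]

# Slicing a bytearray yields another bytearray, not an int.
piece = buf[1:2]

print(second)        # 50
print(bytes(piece))  # b'2' on Python 3, '2' on Python 2
```

Note that bytearray(s) copies the data, so for very large byte strings it may be preferable to convert once and reuse the result rather than converting on every lookup.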

The more typical solution would be to use the six compatibility library, which provides six.indexbytes, so on either version of Python you could do:

>>> import six
>>> six.indexbytes(s, 1)
50

Prefix your string with u and you'll get consistency across Python versions.

# Python 2
>>> ord(u"123"[0])
49

# Python 3
>>> ord(u"123"[0])
49
