
Python 3: How do I get a string literal representation of a byte string?

In Python 3, how do I interpolate a byte string into a regular string and get the same behavior as Python 2 (i.e., get just the escape codes, without the b prefix or doubled backslashes)?

For example:

Python 2.7:

>>> x = u'\u041c\u0438\u0440'.encode('utf-8')
>>> str(x)
'\xd0\x9c\xd0\xb8\xd1\x80'
>>> 'x = %s' % x
'x = \xd0\x9c\xd0\xb8\xd1\x80'

Python 3.3:

>>> x = u'\u041c\u0438\u0440'.encode('utf-8')
>>> str(x)
"b'\\xd0\\x9c\\xd0\\xb8\\xd1\\x80'"
>>> 'x = %s' % x
"x = b'\\xd0\\x9c\\xd0\\xb8\\xd1\\x80'"

Note how, with Python 3, I get the b prefix and doubled backslashes in my output. The result I would like is the one I get in Python 2.

In Python 2 you have the types str and unicode. str represents a simple byte string, while unicode is a Unicode string.

For Python 3, this changed: str is now what unicode was in Python 2, and bytes is what str was in Python 2.
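A minimal sketch of that renaming, run under Python 3. The variable names are only illustrative and are not part of the original answer:

```python
# Python 3: a plain string literal is str (Unicode text).
text = '\u041c\u0438\u0440'      # three Cyrillic characters
data = text.encode('utf-8')      # encoding produces a bytes object

print(type(text).__name__, len(text))   # the text type and its length in characters
print(type(data).__name__, len(data))   # the bytes type and its length in bytes

# Indexing bytes yields an integer, not a one-character string.
first_byte = data[0]
```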

So when you write ("x = %s" % '\u041c\u0438\u0440').encode("utf-8"), you can actually omit the u prefix, as it is implicit. Every string literal in Python 3 is Unicode unless it is explicitly marked as bytes.

This will yield the bytes you are after in Python 3:

 ("x = %s" % '\u041c\u0438\u0440').encode("utf-8")

Note how I encode only after building the final result, which is what you should always do: take an incoming object, decode it to Unicode (however you do that), and then, when producing output, encode it in the encoding of your choice. Don't try to handle raw byte strings. That is just ugly and deprecated behaviour.
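The decode, process, encode pattern described here could look roughly like the following. The helper name `handle` and its `encoding` parameter are hypothetical, chosen only for this sketch:

```python
def handle(raw: bytes, encoding: str = 'utf-8') -> bytes:
    """Hypothetical helper: decode on input, work in str, encode on output."""
    text = raw.decode(encoding)        # bytes -> str at the boundary
    message = 'x = %s' % text          # all formatting happens on str
    return message.encode(encoding)    # str -> bytes only for the final result

incoming = b'\xd0\x9c\xd0\xb8\xd1\x80'   # the UTF-8 bytes from the question
outgoing = handle(incoming)
```

Keeping the formatting step on str means the b prefix never appears, because a bytes object is never passed to %s.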

In your Python 3 example, you are interpolating into a Unicode string, not into a byte string as you are doing in Python 2.

In Python 3.3, bytes do not support interpolation (string formatting or what-have-you). Python 3.5 and later added %-formatting for bytes (PEP 461), but bytes still have no format() method.

Either concatenate, or use Unicode all the way through and only encode once you have interpolated:

b'x = ' + x

or

'x = {}'.format(x.decode('utf8')).encode('utf8')

or

x = u'\u041c\u0438\u0440'  # the u prefix is accepted but has no effect in Python 3.3
'x = {}'.format(x).encode('utf8')
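As a rough check that these approaches agree, the sketch below builds the same byte string several ways. The version test is an addition for illustration: %-formatting on bytes only exists from Python 3.5 onward, so it is not available on the 3.3 interpreter used in the question.

```python
import sys

x = '\u041c\u0438\u0440'.encode('utf-8')   # the byte string from the question

# Option 1: concatenate bytes with bytes.
joined = b'x = ' + x

# Option 2: decode, format as str, then encode the finished text.
formatted = 'x = {}'.format(x.decode('utf8')).encode('utf8')

# Option 3 (Python 3.5+ only): %-formatting directly on bytes.
if sys.version_info >= (3, 5):
    percent = b'x = %s' % x
else:
    percent = joined   # fall back to concatenation on older interpreters

all_equal = joined == formatted == percent
```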

In Python 2, byte strings and regular strings are the same type, so str() performs no conversion. In Python 3 a string is always a Unicode string, so str() of a byte string does a conversion.

You can do your own conversion instead, one that does what you want:
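A short sketch of what that conversion does in Python 3: called without an encoding, str() on a bytes object returns the same text as repr(), which is where the b prefix comes from. The variable names are illustrative only:

```python
x = '\u041c\u0438\u0440'.encode('utf-8')

as_repr = str(x)             # no encoding given: same text as repr(x), b prefix included
as_text = str(x, 'utf-8')    # encoding given: decodes, like x.decode('utf-8')

print(as_repr == repr(x))
print(as_text == '\u041c\u0438\u0440')
```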

x2 = ''.join(chr(c) for c in x)
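To see what this conversion produces, here is a small sketch under Python 3. It maps each byte to the code point with the same number, which as far as I can tell is what decoding as Latin-1 does; the comparison with `decode('latin-1')` is an addition, not part of the original answer:

```python
x = '\u041c\u0438\u0440'.encode('utf-8')

# Iterating over bytes yields integers, so chr() maps byte N to code point N.
x2 = ''.join(chr(c) for c in x)

# The same mapping, expressed as a decode.
x2_latin1 = x.decode('latin-1')

# Interpolating the str gives no b prefix and no doubled backslashes.
formatted = 'x = %s' % x2

# The original bytes can be recovered by reversing the mapping.
round_trip = x2.encode('latin-1')
```

Note that x2 is text whose code points merely mirror the byte values; it is not the decoded Cyrillic word, so it should be encoded back with Latin-1 rather than UTF-8 when bytes are needed again.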

