Python str() function returns byte string

As far as I can tell, Python's str() function should by default return a UTF-8 encoded string. However, unless I explicitly specify the encoding as UTF8, I get a byte string. Should I set a global somewhere to make the default active, or what am I doing wrong? Python 3.10.6 on Fedora 36/XFCE.

#!/usr/bin/python3

# Get the mount point of /dev/sd* mounts.
import subprocess

str2=subprocess.check_output(['cat', '/proc/mounts'])
mounts=str2.splitlines()

#print (mounts)

for x in range(len(mounts)):
    test = str(mounts[x], encoding="UTF8")
    if test[0:7] == '/dev/sd':
        print (test)

The above gives 'test' starting with /dev/sd, but if I omit the encoding, the string starts with b'/dev/.

In Python 3, str() always returns a Unicode string. It is NOT an encoded byte string.

The output of subprocess is a byte string. If you do str(b"1234"), the result is a Unicode string that happens to contain the b prefix and the quotes: "b'1234'". That's NOT a 4-byte byte string. That's a 7-character Unicode string.
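
This is easy to check in an interpreter. A minimal sketch, with variable names chosen only for illustration:

```python
raw = b"1234"            # a byte string, like the output of subprocess

as_text = str(raw)       # no encoding given: str() returns the repr-style text

print(type(as_text))     # <class 'str'>
print(as_text)           # b'1234'
print(len(raw))          # 4
print(len(as_text))      # 7
```

The b and the two quote characters are ordinary characters inside the resulting string, which is why a slice such as test[0:7] no longer matches '/dev/sd'.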

When you do str(b"1234", encoding="UTF8"), that converts the byte string to Unicode, exactly like b"1234".decode('UTF8'), which is the usual way to write what you have written.
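
Applied to the loop in the question, this could look roughly like the sketch below. The function name sd_mounts is hypothetical, and the sample bytes are invented to stand in for the output of subprocess.check_output(['cat', '/proc/mounts']), so that the example runs anywhere.

```python
def sd_mounts(raw: bytes) -> list[str]:
    """Decode raw mount data and return the lines that start with /dev/sd."""
    lines = raw.decode("utf-8").splitlines()
    return [line for line in lines if line.startswith("/dev/sd")]


# Invented sample data (an assumption, not real /proc/mounts output).
sample = (
    b"proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0\n"
    b"/dev/sda1 /boot ext4 rw,relatime 0 0\n"
    b"/dev/sdb1 /mnt/data ext4 rw,relatime 0 0\n"
)

# str(..., encoding=...) and .decode(...) produce the same text.
assert str(b"1234", encoding="utf-8") == b"1234".decode("utf-8") == "1234"

for line in sd_mounts(sample):
    print(line)
```

Decoding once, before splitting, avoids converting every line separately. Passing text=True to subprocess.check_output should also work here, since it makes the call return a str instead of bytes.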

 