Difference between encoding utf-8 and utf8 in Python 3.5

What is the difference between the encodings utf-8 and utf8 (if there is any)?

Given the following example:

u = u'€'
print('utf-8', u.encode('utf-8'))
print('utf8 ', u.encode('utf8'))

It produces the following output:

utf-8 b'\xe2\x82\xac'
utf8  b'\xe2\x82\xac'

There's no difference. See the table of standard encodings. Specifically for 'utf_8', the following are all valid aliases:

'U8', 'UTF', 'utf8'

Also note the statement in the first paragraph:

Notice that spelling alternatives that only differ in case or use a hyphen instead of an underscore are also valid aliases; therefore, e.g. 'utf-8' is a valid alias for the 'utf_8' codec.
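One way to confirm this is to ask the codec registry which codec each spelling resolves to. The following is a minimal sketch using codecs.lookup from the standard library; the helper name resolve_codec_name and the list of spellings are illustrative assumptions, not part of the original answer.

```python
import codecs


def resolve_codec_name(spelling):
    """Return the canonical codec name that the registry resolves a spelling to."""
    return codecs.lookup(spelling).name


# Illustrative spellings: they differ only in case, in hyphen vs. underscore,
# or are aliases listed in the standard encodings table.
spellings = ['utf-8', 'utf8', 'UTF-8', 'utf_8', 'U8', 'UTF']

for s in spellings:
    print('{:>6} -> {}'.format(s, resolve_codec_name(s)))

# Collect the distinct byte strings produced by encoding with each spelling.
encoded = {'€'.encode(s) for s in spellings}
print(encoded)
```

If every spelling names the same codec, each line reports the same canonical name and the set contains a single byte string.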

You can also list the aliases of a specific encoding with the encodings module. Its aliases dictionary maps each alias (the key) to the name of the codec module it resolves to (the value):

>>> from encodings.aliases import aliases
>>>
>>> # k is the alias, v is the codec module it maps to
>>> for k, v in aliases.items():
...     if v == 'utf_8':
...         print('Encoding name:{:>10} -- Module Name: {:}'.format(k, v))
...


Encoding name:       utf -- Module Name: utf_8
Encoding name:        u8 -- Module Name: utf_8
Encoding name: utf8_ucs4 -- Module Name: utf_8
Encoding name: utf8_ucs2 -- Module Name: utf_8
Encoding name:      utf8 -- Module Name: utf_8
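The same lookup can be wrapped in a small reusable function. This is only a sketch: the name aliases_for is hypothetical, and it compares module names exactly so that a module whose name merely contains 'utf_8' (for example utf_8_sig, should it appear among the values) is not matched by accident. The exact list returned depends on the Python version.

```python
from encodings.aliases import aliases


def aliases_for(codec_module):
    """Return the sorted alias names that map exactly to codec_module."""
    return sorted(alias for alias, module in aliases.items()
                  if module == codec_module)


# List every alias registered for the utf_8 codec module.
print(aliases_for('utf_8'))
```

An unknown module name simply yields an empty list, since no alias maps to it.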

And as pointed out in mgilson's answer:

Notice that spelling alternatives that only differ in case or use a hyphen instead of an underscore are also valid aliases; therefore, e.g. 'utf-8' is a valid alias for the 'utf_8' codec.
