How to handle special characters in comments and hard coded strings in python file?

This question aims at the following two scenarios:

  1. You want to assign a string with special characters to a variable:

    special_char_string = "äöüáèô"

  2. You want to allow special characters in comments.

    # This is a comment with special characters in it: äöà etc.

At the moment I handle it this way:

# -*- encoding: utf-8 -*-
special_char_string = "äöüáèô".decode('utf8')
# This is a comment with special characters in it: äöà etc.

Works fine.

Is this the recommended way? Or is there a better solution for this?

Python will check the first or second line for an emacs/vim-like encoding specification.

More precisely, the first or second line must match the regular expression "coding[:=]\s*([-\w.]+)". The first group of this expression is then interpreted as encoding name. If the encoding is unknown to Python, an error is raised during compilation.

Source: PEP 263

(A BOM would also make Python interpret the source as UTF-8.)
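The rule quoted from PEP 263 can be tried out with a few lines of Python. This is only a rough sketch of the check, not the interpreter's actual implementation: the function name declared_encoding is made up for illustration, the pattern is the one quoted above, and the requirement that the line be a comment is added here as an assumption based on the PEP's description of a "magic comment". BOM handling is left out.

```python
import re

# Pattern as quoted from PEP 263; group 1 captures the encoding name.
CODING_RE = re.compile(r"coding[:=]\s*([-\w.]+)")


def declared_encoding(source_text):
    """Return the encoding named on line 1 or 2 of source_text, or None.

    Hypothetical helper for illustration; it does not validate that the
    encoding name is known to Python.
    """
    for line in source_text.splitlines()[:2]:
        # Assumption: only comment lines may carry the declaration.
        if not line.lstrip().startswith("#"):
            continue
        match = CODING_RE.search(line)
        if match:
            return match.group(1)
    return None


print(declared_encoding("# -*- encoding: utf-8 -*-\nx = 1\n"))
print(declared_encoding("#!/usr/bin/env python\n# vim: set fileencoding=latin-1 :\n"))
print(declared_encoding("x = 1\ny = 2\n# coding: utf-8\n"))
```

The third call returns None because the declaration sits on line 3, which is too late to count.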

I would recommend using this over .decode('utf8'):

# -*- encoding: utf-8 -*-
special_char_string = u"äöüáèô"

In any case, special_char_string will then contain a unicode object, no longer a str. As you can see, the two forms are semantically equivalent:

>>> u"äöüáèô" == "äöüáèô".decode('utf8')
True

And the reverse:

>>> u"äöüáèô".encode('utf8')
'\xc3\xa4\xc3\xb6\xc3\xbc\xc3\xa1\xc3\xa8\xc3\xb4'
>>> "äöüáèô"
'\xc3\xa4\xc3\xb6\xc3\xbc\xc3\xa1\xc3\xa8\xc3\xb4'

There is a technical difference, however: u"something" tells the parser that this is a unicode literal, so the unicode object is built directly rather than through a .decode() call at runtime. It should be a bit faster.

Yes, this is the recommended way for Python 2.x; see PEP 0263. In Python 3.x, the default source encoding is UTF-8 rather than ASCII, so the declaration is not needed there. See PEP 3120.
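As a small illustration of the Python 3 case, here is a sketch that assumes the file is saved as UTF-8 and run under Python 3. The variable names are arbitrary. No coding line, u prefix, or .decode() call is needed, because str is already Unicode text and bytes is the encoded form.

```python
# Python 3 sketch: a UTF-8 source file needs no encoding declaration.
text = "äöüáèô"                    # plain literal, already Unicode text
encoded = text.encode("utf-8")     # text -> UTF-8 bytes
decoded = encoded.decode("utf-8")  # bytes -> text again

print(encoded)                     # the same byte sequence shown earlier
print(decoded == text)             # the round trip preserves the text
print(len(text), len(encoded))     # 6 characters, 12 bytes
```

Each of these six characters takes two bytes in UTF-8, which is why the encoded form is twice as long as the text.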
