How can I use a Python source file's 'coding' header to read its contents correctly?

Question

Python source files often come with a coding header similar to the following:

# -*- coding: iso-8859-1 -*-

How can I use this line to properly parse the contents of such a file? Is there a better way than manually opening the file in binary mode, reading one line, and checking whether it contains the header? Is there a library that does this?

Background: this comes up in the context of fixing this bug, which crashes elpy when used in conjunction with Python 3 and importmagic. The code that I'm trying to fix uses:

with open(filename) as fd:
    success = subtree.index_source(filename, fd.read())

and crashes on non-UTF-8 files. Ideally I would like to keep changes to a minimum.

Answer 1

There is tokenize.open(), which does exactly that: it opens a Python source file using the character encoding specified in the coding header (the encoding declaration).
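As a rough illustration of how this could look in practice, here is a minimal sketch. The helper name `read_source` and the temporary demo file are assumptions made for the example, not part of the code from the question.

```python
import os
import tempfile
import tokenize


def read_source(filename):
    """Return the decoded text of a Python source file (hypothetical helper)."""
    # tokenize.open() inspects the BOM and the coding declaration
    # (falling back to UTF-8) and returns a text-mode file object.
    with tokenize.open(filename) as fd:
        return fd.read()


# Demo: write a Latin-1 encoded file with a coding header, then read it back.
with tempfile.NamedTemporaryFile("wb", suffix=".py", delete=False) as tmp:
    tmp.write("# -*- coding: iso-8859-1 -*-\nname = 'caf\u00e9'\n".encode("iso-8859-1"))
    demo_path = tmp.name

demo_text = read_source(demo_path)
os.remove(demo_path)
```

For the snippet in the question, the smallest change would probably be swapping `open(filename)` for `tokenize.open(filename)`, assuming the code only needs to run on Python 3.2 or later, where this function is available.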

You could also decode remote Python files on the fly.
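When the source arrives as bytes rather than as a local file, one possible approach is tokenize.detect_encoding(), which takes a readline callable. The sketch below is only an assumption about how that might be wired up: `decode_source` is a hypothetical helper, and `remote_bytes` stands in for a payload that would really come from the network.

```python
import io
import tokenize


def decode_source(raw_bytes):
    """Decode Python source bytes using the declared encoding (hypothetical helper)."""
    buffer = io.BytesIO(raw_bytes)
    # detect_encoding() reads at most two lines through readline and
    # returns (encoding_name, lines_it_consumed); UTF-8 is the default.
    encoding, _ = tokenize.detect_encoding(buffer.readline)
    buffer.seek(0)
    # TextIOWrapper decodes the bytes and applies universal newlines.
    with io.TextIOWrapper(buffer, encoding=encoding) as text:
        return text.read()


# Stand-in for bytes fetched from a remote location (made-up payload).
remote_bytes = "# -*- coding: iso-8859-1 -*-\ngreeting = 'ol\u00e1'\n".encode("iso-8859-1")
decoded = decode_source(remote_bytes)
```

This mirrors what tokenize.open() does for local files, so both paths should decode the same bytes to the same text.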

Answer 1 (2015-02-11 18:24:34)