Python: UnicodeEncodeError with codecs.open

Question

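I am trying to use orgnode.py (from here) to parse org files. The files are in English/Persian, and file -i reports them as UTF-8 encoded. But when I call the makelist function (which itself uses codecs.open with utf-8), I get this error: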

>>> Orgnode.makelist("toread.org")
[**  [[http://www.apa.org/helpcenter/sexual-orientation.aspx][Sexual orientation, homosexuality and bisexuality]]            :ToRead:



Added:[2013-11-06 Wed]
, **  [[http://stackoverflow.com/questions/11384516/how-to-make-all-org-files-under-a-folder-added-in-agenda-list-automatically][emacs - How to make all org-files under a folder added in agenda-list automatically? - Stack Overflow]] 

(setq org-agenda-text-search-extra-files '(agenda-archives "~/org/subdir/textfile1.txt" "~/org/subdir/textfile1.txt"))
Added:[2013-07-23 Tue] 
, Traceback (most recent call last):
File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 63-66: ordinal not in range(128)

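The function returns a list of org headlines, but in place of the last item (which is written in Persian) it shows the error. Any suggestions on how to deal with this error?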

Answer 1

As the traceback tells you, the exception is raised by the statement you typed at the Python console itself (Orgnode.makelist("toread.org")), not by one of the functions called while that statement was being evaluated.

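This is a typical encoding error that occurs when the interpreter automatically converts a statement's return value in order to display it on the console. The displayed text is the result of applying the built-in repr() to the return value.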

Here, the repr() of the makelist result is a unicode object, which the interpreter by default tries to convert to str using the "ascii" codec.
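The implicit conversion described above can be reproduced explicitly. This is a minimal sketch, assuming a short Persian sample string (an illustrative stand-in, not text from the original toread.org): encoding decoded text with the "ascii" codec fails with the same kind of exception.

```python
# Sample decoded text containing Persian characters (illustrative only;
# the real heading text from toread.org is not shown in the question).
text = u"toread: \u0633\u0644\u0627\u0645"

try:
    # Roughly what Python 2 does implicitly when it has to turn a
    # unicode object into a byte string without a stated encoding.
    text.encode("ascii")
    error_message = None
except UnicodeEncodeError as exc:
    error_message = str(exc)

print(error_message)
```

The printed message names the 'ascii' codec and the positions of the characters it could not encode, just as the traceback in the question does.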

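The culprit is the Orgnode.__repr__ method ( https://github.com/albins/orgnode/blob/master/Orgnode.py#L592 ), which returns a unicode object (because the node contents were automatically decoded by codecs.open), even though __repr__ methods are generally expected to return a string containing only safe (ASCII) characters.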
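To see why the node contents end up as decoded text, here is a small sketch of what codecs.open does when given an encoding. The file contents and the temporary path are made up for illustration; only codecs.open itself comes from the question.

```python
import codecs
import os
import tempfile

# Write a small UTF-8 file to read back (hypothetical sample content).
sample = u"** \u0633\u0644\u0627\u0645 :ToRead:\n"
fd, path = tempfile.mkstemp(suffix=".org")
os.close(fd)
try:
    with codecs.open(path, "w", encoding="utf-8") as handle:
        handle.write(sample)

    # codecs.open decodes while reading, so the caller receives text
    # (unicode on Python 2, str on Python 3) rather than raw bytes.
    with codecs.open(path, "r", encoding="utf-8") as handle:
        line = handle.readline()
finally:
    os.remove(path)

print(isinstance(line, bytes))
```

Anything built from such lines, including a string returned by __repr__, is therefore a unicode object on Python 2 unless it is encoded again.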

Here is the minimal change you can make to Orgnode to solve your problem:

--- a/Orgnode.py
+++ b/Orgnode.py
@@ -612,4 +612,4 @@ class Orgnode(object):
 # following will output the text used to construct the object
         n = n + "\n" + self.body

-        return n
+        return n.encode('utf-8')

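If you want a version that returns only ASCII characters, use an escaping codec instead of 'utf-8': 'unicode-escape' is the one that accepts unicode objects ('string-escape' expects a byte string).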
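A short sketch of the difference between a UTF-8 result and an ASCII-only escaped result. It assumes the 'unicode-escape' codec, which is the escaping codec that works on decoded text; the sample string and the is_ascii helper are illustrative.

```python
text = u"\u0633\u0644\u0627\u0645"  # illustrative Persian sample

as_utf8 = text.encode("utf-8")              # contains bytes above 0x7F
as_escaped = text.encode("unicode-escape")  # backslash escapes only


def is_ascii(data):
    """Return True if every byte in data is below 128."""
    return all(byte < 128 for byte in bytearray(data))


print(is_ascii(as_utf8), is_ascii(as_escaped))
```

The UTF-8 form stays readable on a UTF-8 terminal, while the escaped form is safe on any console at the cost of showing \uXXXX sequences instead of the Persian text.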

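This is only a quick and dirty fix. The proper solution is to rewrite __repr__ properly and to add the __str__ and __unicode__ methods that the class is missing. (If I find the time, I may even fix this myself, since I am very interested in manipulating Org-mode files with Python code.)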
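One way the three methods could fit together is sketched below. HeadlineNode, its heading and body attributes, and the PY2 flag are hypothetical names for illustration and are not Orgnode's actual code; the version check is only there so the sketch also runs on Python 3.

```python
import sys

PY2 = sys.version_info[0] == 2


class HeadlineNode(object):
    """Hypothetical stand-in for a node holding decoded text."""

    def __init__(self, heading, body=u""):
        self.heading = heading
        self.body = body

    def __unicode__(self):
        # Full decoded text; this is what unicode(node) calls on Python 2.
        return self.heading + u"\n" + self.body

    def __repr__(self):
        # ASCII-only form, safe to display on any console.
        escaped = self.__unicode__().encode("unicode-escape")
        return escaped if PY2 else escaped.decode("ascii")

    def __str__(self):
        # Byte string on Python 2 (UTF-8 encoded), plain text on Python 3.
        text = self.__unicode__()
        return text.encode("utf-8") if PY2 else text


node = HeadlineNode(u"** \u0633\u0644\u0627\u0645", u"Added:[2013-11-06 Wed]")
print(repr(node))
```

With this split, displaying a list of nodes at the console only ever touches the ASCII-safe __repr__, and callers who want the Persian text ask for it explicitly.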

Answer 1 accepted 2014-03-25 13:15:30
