How can I output unicode to the emacs Message buffer?
# -*- coding: utf-8 -*-
month = "März"
print month.decode("utf-8")
Running this in the OS X terminal, I get the string März just fine.
Also, my emacs (24.5 on OS X 10.10) seems to handle unicode (or at least umlauts) just fine, since I can see the umlaut in my emacs window.
Yet when I run the code above directly from within emacs I get:
Traceback (most recent call last):
  File "unicode-umlaut.py", line 3, in <module>
    print month.decode("utf-8")
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 1: ordinal not in range(128)
What does this mean? Does it mean that even though emacs is handling a latin-1 character, the emacs Message buffer refuses to handle unicode? Is there a fix to make it possible to output non-ascii characters to the Message buffer in emacs?
Update:
Byte-wise the file looks (via emacs hexl-mode) like this:
00000000: 2320 2d2a 2d20 636f 6469 6e67 3a20 7574 # -*- coding: ut
00000010: 662d 3820 2d2a 2d0a 6d6f 6e74 6820 3d20 f-8 -*-.month =
00000020: 224d c3a4 727a 220a 7072 696e 7420 6d6f "M..rz".print mo
00000030: 6e74 682e 6465 636f 6465 2822 7574 662d nth.decode("utf-
00000040: 3822 290a 8").
The bytes c3 a4 map to a-umlaut (ä), so the file seems to be properly encoded in UTF-8.
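The byte-to-character mapping from the hex dump can be checked directly. This is a minimal sketch; the variable names are illustrative and do not come from the original file.

```python
# -*- coding: utf-8 -*-
# The two bytes that appear in the hex dump where the umlaut sits.
raw = b"\xc3\xa4"

# Decoded as UTF-8 they form one code point, U+00E4 (a-umlaut).
char = raw.decode("utf-8")

# The u'\xe4' in the traceback refers to this code point,
# not to the two UTF-8 bytes stored in the file.
is_a_umlaut = (char == u"\xe4")
code_point_count = len(char)
```

If `is_a_umlaut` is true and `code_point_count` is 1, the file contents are valid UTF-8 for ä, which points away from the file itself as the source of the error.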
This:
# -*- coding: utf-8 -*-
month = "März"
print month.decode("utf-8")
is more simply:
# -*- coding: utf-8 -*-
month = u"März" # Use a Unicode string!
print month
The #coding: utf8 line declares the encoding of the source file, so make sure your editor is configured to save the file in that format.
The first way would break if run on a terminal not configured for UTF-8; the second will work on a terminal configured for any encoding that supports the ä character.
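Printing a Unicode string succeeds only when the output encoding can represent every character in it. The sketch below checks this for a few encodings. `can_encode` is a hypothetical helper, and the list of encodings is only an assumption about what a terminal might be configured for.

```python
# -*- coding: utf-8 -*-
month = u"März"


def can_encode(text, encoding):
    """Return True if every character of `text` exists in `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Hypothetical set of terminal encodings to try.
candidates = ("utf-8", "latin-1", "cp1252", "ascii")
results = {enc: can_encode(month, enc) for enc in candidates}
```

In this sketch only ascii fails, since it has no representation for ä, and ascii is the codec named in the traceback.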
The error message you've shown indicates month is already Unicode, so Python 2 is trying to encode it with the default ascii codec before decoding it back to Unicode with the utf8 codec. That implies you are not running the same code displayed above, since that code uses a byte string.
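The 'ascii' codec message appears whenever a Unicode string containing ä is pushed through the ascii codec. A minimal sketch that reproduces the message and shows one way to avoid the implicit step is below. `stream` is a hypothetical stand-in for a byte-oriented output, not the actual emacs buffer.

```python
# -*- coding: utf-8 -*-
import io

month = u"März"

# Forcing the ascii codec reproduces the error from the traceback.
try:
    month.encode("ascii")
    error_message = None
except UnicodeEncodeError as exc:
    error_message = str(exc)

# Encoding explicitly to UTF-8 yields bytes, so no implicit
# ascii encoding is needed when they are written out.
stream = io.BytesIO()  # hypothetical stand-in for the real output stream
stream.write(month.encode("utf-8"))
written = stream.getvalue()
```

Whether the output stream used inside emacs accepts UTF-8 bytes is an assumption about the setup, so treat the explicit encode as something to try rather than a confirmed fix.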