
How to display UTF-8 in the Windows console

I'm using Python 2.6 on Windows 7.

I borrowed some code from here: Python, Unicode, and the Windows console

My goal is to be able to display UTF-8 strings in the Windows console.

Apparently, in Python 2.6 sys.setdefaultencoding() is no longer supported.

However, I called reload(sys) before trying to use it, and it magically didn't error.

This code will NOT error, but it shows funny characters instead of Japanese text. I believe the problem is that I have not successfully changed the code page of the Windows console.

These are my attempts, but they don't work:

reload(sys)
sys.setdefaultencoding('utf-8')

print os.popen('chcp 65001').read()

sys.stdout.encoding = 'cp65001'

Perhaps you can use win32console to change the code page? I tried the code from the website I linked, but it also raised an error from win32console. Maybe that code is obsolete.

Here's my code, which doesn't error but prints funny characters:

#coding=<utf8>
import os
import sys
import codecs



reload(sys)
sys.setdefaultencoding('utf-8')
sys.stdout = codecs.getwriter('utf8')(sys.stdout)
sys.stderr = codecs.getwriter('utf8')(sys.stderr)

#print os.popen('chcp 65001').read()
print(sys.stdout.encoding)
sys.stdout.encoding = 'cp65001'
print(sys.stdout.encoding)

x = raw_input('press enter to continue')

a = 'こんにちは世界'#.decode('utf8')
print a

x = raw_input()

I know you state you're using Python 2.6, but if you're able to use Python 3.3 you'll find that this is finally supported.

Use the command chcp 65001 before starting Python.

See http://docs.python.org/dev/whatsnew/3.3.html#codecs

In Python 3.6 it's no longer even necessary to use the chcp command, since Python bypasses the byte-level console interface entirely and uses a native Unicode interface instead. See PEP 528: Change Windows console encoding to UTF-8.

As noted in the comments by @mbom007, it's also important to make sure the console is configured with a font that supports the characters you're trying to display.
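To see which encoding Python has picked for the console and whether it can represent a given string, a small check along these lines may help. This is only a sketch: the helper name can_display is made up for illustration, and it tests the encoding only, not whether the console font has the glyphs.

```python
import sys


def can_display(text, stream=None):
    """Return True if `text` can be encoded with the stream's reported encoding.

    `can_display` is a hypothetical helper, not part of any library.
    """
    stream = sys.stdout if stream is None else stream
    # Some stream objects have no encoding attribute or report None.
    encoding = getattr(stream, "encoding", None) or "ascii"
    try:
        text.encode(encoding)
    except (UnicodeEncodeError, LookupError):
        # LookupError covers encodings this Python build does not know.
        return False
    return True
```

Calling can_display(some_text) with no stream checks sys.stdout, so it can be run right before a print to decide whether to fall back to an escaped representation.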

Never ever ever use setdefaultencoding. If you want to write unicode strings to stdio, encode them explicitly. Monkeying around with setdefaultencoding will cause stdlib modules and third-party modules alike to break in horrible subtle ways by allowing implicit conversion between str and unicode when it shouldn't happen.

Yes, the problem is most likely that your code page isn't set properly. However, using os.popen won't change the code page; it'll spawn a new shell, change that shell's code page, and then immediately exit without affecting your console at all. I'm not personally very familiar with Windows, so I couldn't tell you how to change your console's code page from within your Python program.
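For what it's worth, the Win32 API does expose calls for this: GetConsoleOutputCP and SetConsoleOutputCP in kernel32. The following is a minimal sketch that reaches them through ctypes. The function name set_console_output_cp is hypothetical, and whether Python 2.6 then prints correctly under code page 65001 is a separate matter; other replies in this thread report that it does not.

```python
import ctypes
import sys

UTF8_CODE_PAGE = 65001  # the Windows code page number for UTF-8


def set_console_output_cp(code_page=UTF8_CODE_PAGE):
    """Try to switch the output code page of the console this process uses.

    Returns the previous code page on success, or None when not running on
    Windows or when the call fails (for example, no console is attached).
    Hypothetical helper for illustration only.
    """
    if sys.platform != "win32":
        return None
    kernel32 = ctypes.windll.kernel32
    previous = kernel32.GetConsoleOutputCP()
    # SetConsoleOutputCP returns zero on failure.
    if not kernel32.SetConsoleOutputCP(code_page):
        return None
    return previous
```

Unlike os.popen('chcp 65001'), this runs inside the Python process itself, so the change applies to the console that process is attached to rather than to a short-lived child shell.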

The way to properly display unicode data via UTF-8 from Python, as mentioned before, is to explicitly encode your strings before printing them: print s.encode('utf-8')
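A slightly fuller sketch of the same idea, written so that it runs on both Python 2.6+ and Python 3.3+. The helper name write_encoded is an assumption made for illustration, and the demonstration writes to an in-memory byte stream rather than a real console.

```python
# -*- coding: utf-8 -*-
import io


def write_encoded(text, byte_stream, encoding="utf-8", errors="strict"):
    """Encode a Unicode string explicitly and write the bytes to byte_stream.

    Hypothetical helper: nothing here touches the interpreter-wide default
    encoding, the conversion happens only at the output boundary.
    """
    data = (text + u"\n").encode(encoding, errors)
    byte_stream.write(data)
    return data


# Demonstration against an in-memory byte stream (no console involved).
buffer = io.BytesIO()
write_encoded(u"こんにちは世界", buffer)
```

To write to the actual console, pass sys.stdout as byte_stream on Python 2, or sys.stdout.buffer on Python 3.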

Changing the console code page is both unnecessary and won't work (in particular, setting it to 65001 runs into a Python bug). See this question for details, and for how to print Unicode characters to the console regardless of the code page.

Windows doesn't support UTF-8 in a console properly. The only way I know of to display Japanese in the console is by changing (on XP) Control Panel's Regional and Language Options, Advanced tab, Language for non-Unicode Programs to Japanese. After rebooting, open a console and run "chcp" to find out the Japanese console's code page. Then either print Unicode strings or byte strings explicitly encoded in the correct code page.
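As a rough illustration of the last step, assuming chcp reports 932 (the usual Japanese code page), the text can be encoded with the matching Python codec. The helper name encode_for_code_page is hypothetical, and it relies on Python shipping a codec named "cp" plus the number, which holds for 932 but not for every code page.

```python
# -*- coding: utf-8 -*-


def encode_for_code_page(text, code_page):
    """Encode text with the Python codec named after a Windows code page.

    Hypothetical helper. Raises UnicodeEncodeError if the code page cannot
    represent the text, and LookupError if Python has no such codec.
    """
    return text.encode("cp%d" % code_page)


japanese = u"こんにちは世界"
# 932 is an assumed value; use whatever number chcp actually reports.
encoded = encode_for_code_page(japanese, 932)
```

The same call with a code page that has no Japanese characters, such as 437, raises UnicodeEncodeError, which is consistent with funny characters showing up when the bytes and the console code page disagree.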

