How to print unsupported unicode characters on Windows cmd as e.g. "?" instead of raising exception?

If a Unicode character (code point) that is unsupported by Windows cmd, e.g. EN DASH "–", is printed with Python 3 in a Windows cmd terminal using:

print('\u2013')

Then an exception is raised:

UnicodeEncodeError: 'charmap' codec can't encode character '\u2013' in position 0: character maps to <undefined>

Is there a way to make print convert unsupported characters to e.g. "?", or otherwise handle the print so that execution can continue?

Update

There is a better way; see below.


There must be a better way, but this is all I can think of at the moment:

import sys
print('\u2013'.encode(sys.stdout.encoding, errors='replace').decode(sys.stdout.encoding))

This uses encode() to encode the string to the console's encoding, replacing characters that are not valid for that encoding with "?". The encoding is passed explicitly because str.encode() defaults to UTF-8 in Python 3, which can represent every character and would therefore replace nothing. encode() returns a bytes object, so it is then decoded back to str, preserving the replaced characters.

Here is an example using a code point that is not valid in GBK encoding:

>>> s = 'abc\u3020def'
>>> print(s)
abc〠def
>>> s.encode(encoding='gbk')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'gbk' codec can't encode character '\u3020' in position 3: illegal multibyte sequence

>>> s.encode(encoding='gbk', errors='replace')
b'abc?def'
>>> s.encode(encoding='gbk', errors='replace').decode()
'abc?def'

>>> print(s.encode(encoding='gbk', errors='replace').decode())
abc?def
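The encode/decode round trip above can be wrapped in a small helper. This is only a sketch: the name `to_printable` and the fallback to ASCII are assumptions made for illustration, not part of the original answer.

```python
import sys


def to_printable(text, encoding=None):
    """Return text with characters the target encoding cannot represent replaced by '?'."""
    # Assumption: use the stdout encoding when none is given, and fall back
    # to ASCII if stdout does not report an encoding at all.
    enc = encoding or getattr(sys.stdout, "encoding", None) or "ascii"
    return text.encode(enc, errors="replace").decode(enc)


# U+3020 is not encodable in GBK (as shown in the session above).
print(to_printable("abc\u3020def", encoding="gbk"))
# EN DASH is not part of code page 437, a common cmd code page.
print(to_printable("Hello\u2013world!", encoding="cp437"))
```

Calling `print(to_printable(s))` without an explicit encoding targets whatever encoding the current stdout reports, which is usually what is wanted on a Windows console.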

Update

So there is a better way, as mentioned by @eryksun in the comments. Once set up, there is no need to change any code to get unsupported characters replaced. The code below demonstrates the behaviour before and after (I have set my preferred encoding to GBK):

>>> import os, sys
>>> print('\u3030')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'gbk' codec can't encode character '\u3030' in position 0: illegal multibyte sequence

>>> old_stdout = sys.stdout
>>> fd = os.dup(sys.stdout.fileno())
>>> sys.stdout = open(fd, mode='w', errors='replace')
>>> old_stdout.close()

>>> print('\u3030')
?
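On Python 3.7 or later, the same effect may be achievable without reopening the file descriptor, because text streams have a `reconfigure()` method. The sketch below is not from the original answer. It uses an in-memory GBK stream as a stand-in for the console so that it behaves the same on any machine; in a real script the call would presumably be `sys.stdout.reconfigure(errors='replace')`.

```python
import io

# Stand-in for a console stream: GBK encoding with strict error handling.
raw = io.BytesIO()
stream = io.TextIOWrapper(raw, encoding="gbk", errors="strict", newline="\n")

# With strict error handling the write fails, just like print() above.
try:
    stream.write("\u3030\n")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# Switch the existing stream to 'replace' (Python 3.7+).
stream.reconfigure(errors="replace")
stream.write("\u3030\n")
stream.flush()

output = raw.getvalue()
print(strict_failed, output)
```

Because `reconfigure()` changes the existing stream object, any other code holding a reference to it sees the new error handling as well.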

@eryksun's comment mentions setting a Windows environment variable:

PYTHONIOENCODING=:replace

Note the ":" before "replace". This looks like a usable answer that does not require any changes in Python scripts using print.

With that set, print('\u2013') results in:

?

and print('Hello\u2013world!') results in:

Hello?world!
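The environment-variable approach can be tried out from a parent script by launching a child interpreter. This is a sketch under stated assumptions: it sets `PYTHONIOENCODING=ascii:replace` rather than `:replace`, so the encoding is pinned to ASCII and the result does not depend on the machine's code page, and it captures the child's output through a pipe instead of writing to a console.

```python
import os
import subprocess
import sys

# Copy the current environment and add the setting for the child only.
# Assumption: 'ascii' is named explicitly here for a reproducible result;
# the answer above uses ':replace' to keep the default encoding.
env = dict(os.environ, PYTHONIOENCODING="ascii:replace")

result = subprocess.run(
    [sys.executable, "-c", "print('Hello\\u2013world!')"],
    env=env,
    capture_output=True,  # Python 3.7+
    check=True,
)

# Strip the trailing newline, which differs between platforms.
child_output = result.stdout.strip()
print(child_output)
```

The child script itself is unchanged; only its environment differs, which is the point of this approach.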
