
UnicodeEncodeError in IPython but not standard REPL

I'm reading in a file that contains Unicode characters using Python 3.6.3. In the standard Python REPL, I'm able to read the file with no problems by specifying UTF-8 encoding:

>>> with open("emoji.csv", encoding='utf-8') as f:
...     lines = f.readlines()
>>> lines
['this line has an emoji \U0001f644\n']

No problems there. However, when I try the same in IPython 6.1.0, I get the following UnicodeEncodeError:

In [1]: with open('emoji.csv', encoding='utf-8') as f:
   ...:     lines = f.readlines()
   ...:

In [2]: lines
Out[2]: ---------------------------------------------------------------------------
UnicodeEncodeError                        Traceback (most recent call last)
<ipython-input-2-3fb162a4fe05> in <module>()
----> 1 lines

/opt/anaconda/lib/python3.6/site-packages/IPython/core/displayhook.py in __call__(self, result)
    259             self.fill_exec_result(result)
    260             if format_dict:
--> 261                 self.write_format_data(format_dict, md_dict)
    262                 self.log_output(format_dict)
    263             self.finish_displayhook()

/opt/anaconda/lib/python3.6/site-packages/IPython/core/displayhook.py in write_format_data(self, format_dict, md_dict)
    188                 result_repr = '\n' + result_repr
    189 
--> 190         print(result_repr)
    191 
    192     def update_user_ns(self, result):

UnicodeEncodeError: 'ascii' codec can't encode character '\U0001f644' in position 24: ordinal not in range(128)

Similarly, if I simply encode and decode the Unicode character by itself, I get the same error:

In [1]: '\U0001f644'.encode('utf-8').decode('utf-8')
Out[1]: ---------------------------------------------------------------------------
...
...
UnicodeEncodeError: 'ascii' codec can't encode character '\U0001f644' in position 1: ordinal not in range(128)

What is causing this, and how do I read this file in IPython?

Edit: It seems this is a result of IPython using an ASCII encoding by default:

In [1]: from IPython.utils.encoding import get_stream_enc; import sys

In [2]: get_stream_enc(sys.stdout)
Out[2]: 'ANSI_X3.4-1968'

However, I don't see anything in the IPython documentation on how to change this. Is this possible?
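The same check can be run with the standard library alone, which helps confirm whether the output encoding (rather than the file read) is the problem. This is only a rough sketch: `can_encode` is a hypothetical helper name introduced here, not part of IPython or the original post.

```python
import locale
import sys


def can_encode(text, encoding):
    """Return True if `text` can be encoded with `encoding` under strict errors."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# Encoding the interpreter picked for standard output (may be None if
# stdout has been replaced by an object without an encoding attribute).
print("stdout encoding:   ", getattr(sys.stdout, "encoding", None))
# Encoding implied by the current locale settings.
print("preferred encoding:", locale.getpreferredencoding(False))

# 'ANSI_X3.4-1968' is the name reported above; Python treats it as ASCII.
print("emoji fits ASCII:  ", can_encode("\U0001f644", "ANSI_X3.4-1968"))
print("emoji fits UTF-8:  ", can_encode("\U0001f644", "utf-8"))
```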

This is due to my system using a POSIX locale. Setting the environment variable PYTHONIOENCODING=UTF-8 before starting IPython resolved the issue by overriding the ASCII-based encoding IPython was using by default.
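To illustrate why the output encoding matters, here is a minimal sketch that reproduces the failure without touching the real terminal. It assumes an in-memory stream is a fair stand-in for `sys.stdout`; `print_to_stream` and `EMOJI_LINE` are names made up for this example.

```python
import io

EMOJI_LINE = "this line has an emoji \U0001f644"


def print_to_stream(text, encoding):
    """Print `text` to an in-memory text stream and return the bytes written.

    The stream stands in for sys.stdout configured with `encoding`.
    """
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding, errors="strict", newline="\n")
    print(text, file=stream)
    stream.flush()
    return raw.getvalue()


for encoding in ("ascii", "utf-8"):
    try:
        data = print_to_stream(EMOJI_LINE, encoding)
    except UnicodeEncodeError as exc:
        # Same exception type that IPython's print(result_repr) raised above.
        print(f"{encoding}: failed ({exc.reason})")
    else:
        print(f"{encoding}: wrote {len(data)} bytes")
```

With the variable set at launch, for example `PYTHONIOENCODING=UTF-8 ipython` in a POSIX shell, the real standard output behaves like the UTF-8 stream in this sketch.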
