UnicodeEncodeError: 'ascii' codec can't encode character error

I am reading some files from Google Cloud Storage using Python:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName('aggs').getOrCreate()

# Read the tab-separated files as UTF-8, inferring the schema from the data
df = spark.read.option("sep", "\t").option("encoding", "UTF-8").csv('gs://path/', inferSchema=True, header=True)
df.count()
df.show(10)

However, I keep getting an error that points at the df.show(10) line:

df.show(10)
File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/dataframe.py", line 350, in show
UnicodeEncodeError: 'ascii' codec can't encode character u'\ufffd' in position 162: ordinal not in range(128)

I googled and found that this seems to be a common error, and that the suggested solution is to add the "UTF-8" encoding to spark.read.option, as I already did. That doesn't help: I am still getting this error. Could experts help? Thanks in advance.

How about exporting PYTHONIOENCODING before running your Spark job, so that Python encodes its standard output as UTF-8 instead of ASCII:

export PYTHONIOENCODING=utf8
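The traceback points at show(), which prints the rows, so the failure most likely happens when the text is written to stdout rather than when the files are read. The following is a minimal sketch of that idea, not PySpark's own code: the helper name encodable is hypothetical, and it only checks whether a string can be encoded with a given codec.

```python
import sys


def encodable(text, encoding):
    """Return True if `text` can be encoded with `encoding` (hypothetical helper)."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# The encoding Python currently uses for stdout; PYTHONIOENCODING overrides it.
print(sys.stdout.encoding)

# U+FFFD is the character from the traceback: it has no ASCII representation.
print(encodable(u'\ufffd', 'ascii'))
print(encodable(u'\ufffd', 'utf-8'))
```

If the first print reports an ASCII-based encoding, that would be consistent with the error above, and setting PYTHONIOENCODING should change it.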

For Python 3.7+ the following should also do the trick:

import sys
sys.stdout.reconfigure(encoding='utf-8')
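Since reconfigure() only exists on text streams in Python 3.7 and later, a guarded variant may be safer when the same script runs on several interpreters. This is a sketch under that assumption; the function name force_utf8 is hypothetical.

```python
import sys


def force_utf8(stream=None):
    """Switch a text stream to UTF-8 if it supports reconfigure().

    Returns True when the stream was reconfigured, False otherwise.
    `force_utf8` is an illustrative name, not part of any library.
    """
    stream = sys.stdout if stream is None else stream
    reconfigure = getattr(stream, "reconfigure", None)
    if reconfigure is None:
        # Older interpreters or replaced streams: nothing to do here.
        return False
    reconfigure(encoding="utf-8")
    return True
```

Calling force_utf8() once at the top of the job, before df.show(10), would have the same effect as the one-liner above on interpreters that support it.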

For Python 2.x you can use the following:

import sys
reload(sys)
sys.setdefaultencoding('utf-8')


