Unicode encode error when writing a pandas df to csv

Question

I cleaned 400 Excel files, read them into Python using pandas, and appended all the raw data into one big df.

Then, when I try to export it to a csv:

df.to_csv("path",header=True,index=False)

I get this error:

UnicodeEncodeError: 'ascii' codec can't encode character u'\xc7' in position 20: ordinal not in range(128)

Can someone explain what this means and suggest a way to fix it?

Thanks

Answer 1

You have unicode values in your DataFrame. Files store bytes, which means all unicode text has to be encoded into bytes before it can be stored in a file. You have to specify an encoding, such as utf-8. For example,

df.to_csv('path', header=True, index=False, encoding='utf-8')

If you don't specify an encoding, then the encoding used by df.to_csv defaults to ascii in Python 2, or utf-8 in Python 3.
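To see what the error message means, here is a minimal sketch in Python 3 that reproduces the failure without pandas. The sample string is made up for illustration; the only thing taken from the question is the character u'\xc7' (Ç), which has no ASCII representation but encodes to two bytes in UTF-8.

```python
# Hypothetical sample value containing the character from the error, U+00C7
text = u"\xc7a va"

# ASCII only covers code points 0-127, so encoding this text fails
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# UTF-8 can represent any Unicode character
utf8_bytes = text.encode("utf-8")

print(ascii_ok)
print(utf8_bytes)
```

Passing encoding='utf-8' to df.to_csv asks pandas to do the second kind of encoding when it writes the file.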

Answer 2

Adding an answer to help myself google it later:

One trick that helped me is to encode a problematic series with unicode-escape first, then decode the result back to a string using utf-8. Like:

df['crumbs'] = df['crumbs'].map(lambda x: x.encode('unicode-escape').decode('utf-8'))

This also gets the DataFrame to print correctly.
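It may help to see what this trick does to a single value. The sketch below uses a made-up cell value (not data from the question) and assumes Python 3. The unicode-escape codec replaces each non-ASCII character with a backslash escape sequence, so the result is plain ASCII that any codec can write, but the stored text is no longer the original characters.

```python
# Hypothetical cell value containing a non-ASCII character (U+00C7)
value = u"\xc7a va"

# Same transformation as the lambda above, applied to one string
escaped = value.encode("unicode-escape").decode("utf-8")

# The single character U+00C7 is now four literal characters: \ x c 7
print(len(value), len(escaped))

# The escaped text is pure ASCII, so writing it can no longer fail
ascii_bytes = escaped.encode("ascii")

# Reversing the escape recovers the original text
restored = ascii_bytes.decode("unicode-escape")
print(restored == value)
```

If the csv needs to keep the real characters, specifying encoding='utf-8' as in Answer 1 is likely the better fit; the escape approach is mainly useful when the output must stay ASCII.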
