
Reading large Excel files in Python and UnicodeDecodeError

I am new to Python and I'm trying to read a large Excel file. I converted my xlsx file to csv so I could work with pandas, and wrote the code below:

import pandas as pd
pd.read_csv('filepath.csv')
df = csv.parse("Sheet")
df.head()

But it gives this error: UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 28: character maps to <undefined>

Can you please tell me why it gives this error? Or do you have any advice for reading large Excel files? I also tried the openpyxl module, but I couldn't use read_only because of my Python version (I am using Python 2.7.8).

In Microsoft Excel, save the workbook as a Unicode Text file.


Then read the file like this:

import pandas as pd
# The Unicode Text export is tab-separated and UTF-16 encoded
df = pd.read_csv(filename, sep='\t', encoding='utf-16-le')
print(df.head())
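
If you are not sure the exported file really is UTF-16, one way to check before calling read_csv is to look at its byte order mark. The sketch below is only an illustration: sniff_bom is a hypothetical helper name, the sample data is made up, and it assumes the file starts with a BOM (which Excel's Unicode Text export typically does, as far as I know).

```python
import codecs
import os
import tempfile


def sniff_bom(path):
    """Return an encoding name implied by the file's byte order mark, or None."""
    with open(path, "rb") as fh:
        head = fh.read(4)
    # Check UTF-32 first: its little-endian BOM begins with the UTF-16 LE BOM.
    if head.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return "utf-32"
    if head.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    if head.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    return None


# Demo on a stand-in file (hypothetical data, not the asker's export).
sample = u"name\tcity\nZo\u00eb\tK\u00f6ln\n".encode("utf-16")  # the codec prepends a BOM
tmp = tempfile.NamedTemporaryFile(suffix=".txt", delete=False)
tmp.write(sample)
tmp.close()
bom_encoding = sniff_bom(tmp.name)
os.remove(tmp.name)
print(bom_encoding)
```

The returned name can be passed as the encoding argument of pd.read_csv. A result of None only means no BOM was found, so the encoding still has to be determined some other way.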

Try passing the encoding explicitly:

pd.read_csv('filepath.csv', encoding='utf-8')

Other encodings you can try include encoding='iso-8859-1', encoding='cp1252', or encoding='latin1' (iso-8859-1 and latin1 are two names for the same codec). Choose the one that matches how the file was actually written.
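
Rather than editing the encoding argument by hand each time, you could test the candidates in a loop. This is a rough sketch, not a definitive method: first_working_encoding is a hypothetical helper, the candidate order is my assumption, and the sample bytes are invented to include the 0x81 byte from the error message. It reads the file in chunks, so it should cope with large files.

```python
import codecs
import os
import tempfile


def first_working_encoding(path, candidates=("utf-8", "cp1252", "latin1"),
                           chunk_size=1 << 20):
    """Return the first candidate that decodes the whole file without error."""
    for enc in candidates:
        decoder = codecs.getincrementaldecoder(enc)()  # strict error handling
        try:
            with open(path, "rb") as fh:
                for chunk in iter(lambda: fh.read(chunk_size), b""):
                    decoder.decode(chunk)
                decoder.decode(b"", final=True)  # flush any incomplete sequence
        except UnicodeDecodeError:
            continue  # this candidate cannot decode the file, try the next one
        return enc
    raise ValueError("no candidate encoding could decode %r" % path)


# Demo on a stand-in file containing byte 0x81 (hypothetical data).
demo = tempfile.NamedTemporaryFile(suffix=".csv", delete=False)
demo.write(b"name,value\ncaf\x81,1\n")
demo.close()
detected = first_working_encoding(demo.name)
os.remove(demo.name)
print(detected)
```

Keep latin1 last: it maps every possible byte to a character, so it never raises, but that also means a "successful" decode can still give the wrong characters. Python's cp1252 codec leaves 0x81 undefined, which is why the 'charmap' error appears in the first place on systems where cp1252 is the default.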


 