
Python: Pandas read_excel cannot open .xls file, xlrd not supported

Problem:

I am opening a .xls file with pd.read_excel, but I get an error.

df_cima = pd.read_excel("docs/Presentaciones.xls")

xlrd.biffh.XLRDError: Excel xlsx file; not supported

The extension of this file is .xls, but the error tells me that it is .xlsx.

Then I tried adding engine="openpyxl", which is usually used to read .xlsx files when the xlrd version is newer than 1.2.0, but that gives me another error:

openpyxl.utils.exceptions.InvalidFileException: openpyxl does not support the old .xls file format, please use xlrd to read this file, or convert it to the more recent .xlsx file format.

My env:

  • pandas version: 1.1.5
  • xlrd version: 2.0.1
  • openpyxl version: 3.0.6

I do not want to downgrade xlrd to 1.2.0. From another answer I see that the new version of xlrd supports only .xls, but I don't understand why it is not working for my file.

Thanks in advance.

Your file is not a .xls, I insist :)

With Pandas 1.1.5 and xlrd 2.0.1:

Rename Presentaciones.xls to Presentaciones.xlsx.

import pandas as pd
# Use openpyxl.
df = pd.read_excel(r'X:...\Presentaciones.xlsx', engine='openpyxl')
print(df)
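
If renaming the file is not an option, here is a minimal sketch of an alternative. It assumes that openpyxl only checks the extension when it receives a path string, so passing an already-open binary handle avoids the InvalidFileException. The helper name read_xlsx_with_wrong_suffix is made up for this example.

```python
from pathlib import Path


def read_xlsx_with_wrong_suffix(path, **kwargs):
    """Read an xlsx workbook whose file name ends in .xls, without renaming it.

    Assumption: openpyxl validates the extension only for path strings,
    so an open binary handle is accepted regardless of the file name.
    """
    # Imported inside the function so this sketch can be pasted into a
    # module without making pandas an import-time dependency.
    import pandas as pd

    with open(Path(path), "rb") as handle:
        # The whole sheet is read before the handle is closed.
        return pd.read_excel(handle, engine="openpyxl", **kwargs)
```

For the file in the question this would be called as df_cima = read_xlsx_with_wrong_suffix("docs/Presentaciones.xls"). Extra keyword arguments such as sheet_name are passed straight through to pd.read_excel.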

Enjoy :)

More info

How do I know that your file is a fake .xls and a very real .xlsx? Because openpyxl doesn't work with real xls files, yet it reads yours once the extension is changed. With a genuine .xls file, openpyxl refuses the extension:

import pandas as pd
df = pd.read_excel(r'X:...\test.xls', engine='openpyxl')
# ERROR:
# InvalidFileException: openpyxl does not support the old .xls file format,
# please use xlrd to read this file, or convert it to the more recent .xlsx file format.

And simply renaming a genuine test.xls to test.xlsx does not work either!

import pandas as pd
df = pd.read_excel(r'X:...\test.xlsx', engine='openpyxl')
# ERROR:
# OSError: File contains no valid workbook part
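
A more direct way to tell a fake .xls from a real one is to look at the first bytes of the file instead of its name. The sketch below uses only the standard library and rests on two assumptions: a legacy .xls is an OLE2 compound file, and an .xlsx is a ZIP archive that contains xl/workbook.xml. The function name sniff_excel_format is hypothetical.

```python
import zipfile

OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # container used by legacy .xls
ZIP_MAGIC = b"PK\x03\x04"                          # .xlsx files are ZIP archives


def sniff_excel_format(path):
    """Guess the workbook format from the file contents, ignoring the extension."""
    with open(path, "rb") as fh:
        head = fh.read(8)

    if head.startswith(OLE2_MAGIC):
        # Other Office formats also use OLE2, so "xls" here is a best guess.
        return "xls"

    if head.startswith(ZIP_MAGIC) and zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as archive:
            # Assumption: the workbook part sits at its usual location.
            if "xl/workbook.xml" in archive.namelist():
                return "xlsx"

    return "unknown"
```

If this returns "xlsx" for a file named Presentaciones.xls, the extension is lying and openpyxl is the engine to use. As far as I know, xlrd 2.x ships a comparable check as xlrd.inspect_format, which is where the "Excel xlsx file; not supported" message in the question comes from.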

History

Beware: the .xlsx format (detected by pandas) means there may be scripts in this file. Sometimes the extension can lie, so be careful!

The reason why xlrd stopped supporting xlsx files (from version 2.0 on) is that those files are a security hazard and no one was maintaining this part of the code.
