Using pandas, how can I read only one sheet from an Excel workbook without loading any of the other sheets?
This is a little different from similar questions. I am using the FactSet plugin for Excel, if anyone's familiar with it. I have a workbook with multiple tabs that use the FactSet plugin to pull in data. No issues with this.
The problem is that FactSet generates a hidden cache sheet in every Excel workbook, and when that sheet is read in, loading any sheet of the file from Python returns an empty DataFrame.
I've tried two methods to get the data from the sheet, but both load the entire workbook, which causes the FactSet cache sheet to be read in.
import pandas as pd
fname = 'facset_excel_file.xlsm'
# method 1
wb = pd.ExcelFile(fname)
symbols = pd.read_excel(wb, sheet_name='Symbols') # returns empty DataFrame
# method 2
pd.read_excel(fname, sheet_name='Symbols') # returns empty DataFrame
So my questions are: how can I read in a single sheet without loading the entire workbook? Or is it possible to exclude a worksheet by name when loading a workbook?
You can try something like the following. pd.read_excel accepts either the file name as a string or an open file object:
with open('facset_excel_file.xlsm', 'rb') as f:  # close the handle when done
    df = pd.read_excel(f, sheet_name='123')
Try it.
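If passing a file object still returns an empty DataFrame, another option that may be worth trying is to bypass pd.read_excel and open the workbook with openpyxl in read-only mode, where worksheets are parsed lazily on access. The sketch below is only an illustration: the helper name read_single_sheet is hypothetical, it assumes the first row of the sheet holds the column headers, and it has not been checked against a FactSet workbook, so whether it avoids the cache sheet problem is an assumption.

```python
import pandas as pd
from openpyxl import load_workbook


def read_single_sheet(path, sheet_name):
    """Read one worksheet into a DataFrame (hypothetical helper).

    Assumes the first row of the sheet contains the column headers.
    """
    # read_only=True makes openpyxl parse worksheets lazily, on access;
    # data_only=True returns cached formula results instead of formula text.
    wb = load_workbook(path, read_only=True, data_only=True)
    try:
        rows = wb[sheet_name].iter_rows(values_only=True)
        header = next(rows, None)
        if header is None:
            # The sheet has no rows at all.
            return pd.DataFrame()
        return pd.DataFrame(list(rows), columns=list(header))
    finally:
        # Read-only mode keeps the file handle open until closed explicitly.
        wb.close()
```

With the file from the question, the call would look like symbols = read_single_sheet('facset_excel_file.xlsm', 'Symbols').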