
Using pandas, how to only read one sheet from Excel workbook without loading any of the other sheets?

This is a little different from similar questions. I am using the FactSet plugin for Excel, if anyone's familiar with it. I have a workbook with multiple tabs that use the FactSet plugin to pull in data, and that part works without issues.

The problem is that FactSet generates a hidden cache sheet in every Excel workbook. When that sheet is read in, loading any sheet of the file from Python returns an empty DataFrame.

I've tried two methods to get the data from the sheet, but both load the entire workbook, so the FactSet cache sheet gets read in as well.

import pandas as pd

fname = 'facset_excel_file.xlsm'

# method 1
wb = pd.ExcelFile(fname)
symbols = pd.read_excel(wb, sheet_name='Symbols')  # returns empty DataFrame

# method 2
pd.read_excel(fname, sheet_name='Symbols')  # returns empty DataFrame

So my questions are: how can I read in a single sheet without loading the entire workbook? Or is it possible to exclude a worksheet by name when loading a workbook?

read_excel accepts either the file name as a string or an open file object, so you can try something like:

# open the workbook in binary mode and close it once the sheet has been read
with open('facset_excel_file.xlsm', 'rb') as f:
    df = pd.read_excel(f, sheet_name='Symbols')

Try it.
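
Another thing that may be worth trying, though it is not confirmed to get around the FactSet cache sheet: openpyxl's read-only mode parses worksheets lazily, so only the sheet you ask for is read. Below is a minimal sketch, assuming openpyxl is installed and the first row of the sheet holds the column names. read_single_sheet is a hypothetical helper name, not a pandas or openpyxl function.

```python
import pandas as pd
from openpyxl import load_workbook


def read_single_sheet(path, sheet_name):
    """Return one worksheet as a DataFrame, using its first row as the header.

    Hypothetical helper for illustration; assumes the sheet has a header row.
    """
    # read_only=True makes openpyxl parse a worksheet only when it is accessed;
    # data_only=True returns stored formula results instead of formula text.
    wb = load_workbook(path, read_only=True, data_only=True)
    try:
        rows = list(wb[sheet_name].iter_rows(values_only=True))
    finally:
        wb.close()  # read-only mode keeps the file open until closed
    if not rows:
        return pd.DataFrame()
    header, *data = rows
    return pd.DataFrame(data, columns=list(header))
```

Usage would be symbols = read_single_sheet('facset_excel_file.xlsm', 'Symbols'). Note that with data_only=True, formula cells come back as the values stored when the workbook was last saved by Excel, so a cell whose formula was never calculated and saved will be None.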
