How to read multiple Excel sheets into Python when there are multiple possible names for a sheet

I have over 4000 xlsx files that each contain a sheet that is named ALMOST the same thing each time.

It always follows this format: XXX-XXX-001

However, the last number sometimes changes, and sometimes there is whitespace at the start or end of the sheet name. I've looked, and there doesn't seem to be any regex option for pandas read_excel. Any suggestions? Is there some sort of 'if in ()' check I can do?

Thanks!

If memory is not an issue, you can read all the sheets in the Excel file first, and then filter by sheet name.

First, read the Excel file with every sheet loaded:

import pandas as pd
df_dict = pd.read_excel(filename, sheet_name=None)  # sheet_name=None returns a dict of {sheet name: DataFrame}

Then filter the sheet names, keeping the first one that contains the expected text:

dfname = list(df_dict)  # the dict keys are the sheet names
wanted_df_name = [ele for ele in dfname if 'XXX-XXX-001' in ele][0]

Finally, take the DataFrame for that sheet from df_dict:

wanted_df = df_dict[wanted_df_name]
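
Since the question also mentions that the trailing number can change, a regex over the sheet names may be a closer fit than a fixed substring. The following is a minimal sketch, not a definitive implementation. It assumes the real sheet name is a fixed prefix (written here as the placeholder "XXX-XXX-") followed by digits, with optional surrounding whitespace. The names SHEET_PATTERN, find_sheet_name and read_matching_sheet are hypothetical helpers introduced for illustration. It uses pd.ExcelFile to list the sheet names first, so only the matching sheet is parsed.

```python
import re

# Assumption: the sheet name is the literal prefix "XXX-XXX-" followed by digits,
# optionally padded with whitespace. Replace the prefix with the real one.
SHEET_PATTERN = re.compile(r"\s*XXX-XXX-\d+\s*")


def find_sheet_name(sheet_names, pattern=SHEET_PATTERN):
    """Return the first sheet name that fully matches the pattern, or None."""
    for name in sheet_names:
        if pattern.fullmatch(name):
            return name
    return None


def read_matching_sheet(path, pattern=SHEET_PATTERN):
    """Read only the matching sheet of one workbook into a DataFrame."""
    import pandas as pd  # imported here so find_sheet_name works without pandas

    with pd.ExcelFile(path) as workbook:
        name = find_sheet_name(workbook.sheet_names, pattern)
        if name is None:
            raise ValueError(f"no sheet matching {pattern.pattern!r} in {path}")
        return workbook.parse(name)
```

Calling read_matching_sheet(filename) in a loop over the 4000 files would then return one DataFrame per workbook, and raise a ValueError for any file with no matching sheet so that it can be inspected by hand.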
