I have over 4000 xlsx files that each contain a sheet that is named ALMOST the same thing each time.
It always follows this format: XXX-XXX-001
However, the last number sometimes changes, and sometimes there is white space at the start or end of the sheet name. I've looked and there doesn't seem to be any regex option for pandas read_excel. Any suggestions? Is there some sort of 'if in ()' check I can do?
Thanks!
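If memory is not an issue, you can read all the sheets in the Excel file first and then filter by sheet name.
First, read the Excel file. Passing sheet_name=None returns a dict that maps each sheet name to its DataFrame (older pandas versions spelled this parameter sheetname):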
import pandas as pd
df_dict = pd.read_excel(filename, sheet_name=None)
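Then filter the sheet names. Since the trailing number can change, match on the fixed prefix; a substring check also tolerates leading or trailing whitespace:
dfname = list(df_dict)
wanted_df_name = [ele for ele in dfname if 'XXX-XXX-' in ele][0]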
Finally, take the DataFrame from df_dict:
wanted_df = df_dict[wanted_df_name]
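If reading every sheet of 4000 files turns out to be too slow, one alternative is to list the sheet names first and parse only the one that matches. Below is a minimal sketch of that idea, not a tested solution for your files. The names SHEET_PATTERN, find_sheet_name and read_matching_sheet are made up for illustration, and the regex assumes the name is literally XXX-XXX- followed by three digits, so adjust it to your real naming scheme.

```python
import re

# Assumed pattern: optional whitespace, the fixed prefix, three digits,
# optional whitespace. Replace with the real naming scheme.
SHEET_PATTERN = re.compile(r"\s*XXX-XXX-\d{3}\s*")


def find_sheet_name(sheet_names, pattern=SHEET_PATTERN):
    """Return the first sheet name that fully matches the pattern, or None."""
    for name in sheet_names:
        if pattern.fullmatch(name):
            return name
    return None


def read_matching_sheet(path, pattern=SHEET_PATTERN):
    """Open the workbook, find the matching sheet, and parse only that sheet."""
    import pandas as pd  # imported here so find_sheet_name works without pandas

    with pd.ExcelFile(path) as xls:
        name = find_sheet_name(xls.sheet_names, pattern)
        if name is None:
            raise ValueError(f"no sheet matching {pattern.pattern!r} in {path}")
        return xls.parse(name)
```

You would then call wanted_df = read_matching_sheet(filename) for each file. The sheet is looked up by its original name, whitespace included, so the lookup still works when the name has stray spaces.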