I have an Excel file with around 50 sheets. The data in every sheet looks like the sample below:
I want to read the first row and all the rows after 'finish' in the first column.
I have written my script as something like this.
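import pandas as pd
df = pd.read_excel('excel1_all_data.xlsx')
finish_idx = df[df.iloc[:, 0] == 'finish'].index[0]  # row label of the 'finish' marker
df = pd.concat([df.head(1), df[df.index > finish_idx]])  # first row + rows after 'finish'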
With this, the 'start' and 'finish' rows are gone. My question is: how can I iterate through all the sheets in the same way and append the results into one dataframe? I would also like a column containing the sheet name. The data in the other sheets is similar but has different dates and names. 'start' and 'finish' are still present in every sheet, and we want everything after 'finish'.
Thank you so much for your help
Try this code and let me know if it works for you:
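import pandas as pd
wbSheets = pd.ExcelFile("excel1_all_data.xlsx").sheet_names  # list of all sheet names
frames = []
for st in wbSheets:
    df = pd.read_excel("excel1_all_data.xlsx", sheet_name=st)
    frames.append(df.iloc[[0]])  # first row of the sheet
    frames.append(df[5:])        # rows from position 5 on; assumes 'finish' is at position 4
res = pd.concat(frames)
print(res)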
pd.ExcelFile("excel1_all_data.xlsx").sheet_names returns the list of sheet names, which is what you iterate over.
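If 'finish' is not on the same row in every sheet, a variation along these lines might help. It is only a sketch: it looks up the first 'finish' in the first column instead of slicing at a fixed position, and it adds the sheet name as a column. The helper names rows_after_finish and combine_sheets and the column name "sheet" are made up for illustration, and the fallback for sheets without a 'finish' row is an assumption you may want to change.

```python
import pandas as pd


def rows_after_finish(df, marker="finish"):
    """Return the first row plus every row after the first `marker` in column 0."""
    if df.empty:
        return df
    first = df.iloc[:1]
    # Positions (not labels) where the first column equals the marker.
    hits = (df.iloc[:, 0] == marker).to_numpy().nonzero()[0]
    if len(hits) == 0:
        # Assumption: a sheet without the marker contributes only its first row.
        return first
    return pd.concat([first, df.iloc[int(hits[0]) + 1:]])


def combine_sheets(sheets, marker="finish"):
    """Combine a {sheet_name: DataFrame} mapping into one frame with a 'sheet' column."""
    frames = []
    for name, df in sheets.items():
        part = rows_after_finish(df, marker).copy()
        part.insert(0, "sheet", name)  # hypothetical column name
        frames.append(part)
    return pd.concat(frames, ignore_index=True)


# Usage with the workbook from the question (sheet_name=None loads every sheet
# into a dict keyed by sheet name):
# sheets = pd.read_excel("excel1_all_data.xlsx", sheet_name=None)
# res = combine_sheets(sheets)
# print(res)
```

Reading with sheet_name=None opens the workbook once and returns all sheets as a dict, so there is no need to fetch the sheet names separately.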