
Iterate through multiple sheets in an Excel file, keep the rows after a given value in each sheet, and append all the sheets

I have an Excel file with around 50 sheets. All the data in the file looks like this:

[image: enter image description here]

I want to read the first row and all the rows after 'finish' in the first column.

I have written my script as something like this.

df = pd.read_excel('excel1_all_data.xlsx')
df = df.head(1).append(df[df.index > df[df.iloc[:, 0] == 'finish'].index[0] + 1])

The output looks like this: [image: enter image description here]

The 'start' and 'finish' rows are gone. My question is: how can I iterate through all the sheets in the same way and append them into one dataframe? I would also like a column that holds the sheet name. The data in the other sheets is similar but has different dates and names. 'start' and 'finish' are still present in every sheet, and we want everything after 'finish'.

Thank you so much for your help

Try this code and let me know if it works for you:

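import pandas as pd

# List every sheet name in the workbook
wbSheets = pd.ExcelFile("excel1_all_data.xlsx").sheet_names
frames = []
for st in wbSheets:
    df = pd.read_excel("excel1_all_data.xlsx", sheet_name=st)
    frames.append(df.iloc[[0]])  # first row of the sheet
    frames.append(df[5:])        # rows from position 5 onward (assumes the rows after 'finish' start there)
res = pd.concat(frames)
print(res)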

pd.ExcelFile("excel1_all_data.xlsx").sheet_names returns the list of sheet names, which is what you iterate over.
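
If 'finish' does not sit at the same position in every sheet, the fixed slice above can be swapped for a lookup, and the sheet name the question asks for can be added as a column. The following is only a sketch under a few assumptions: the helper names rows_after_marker and combine_sheets and the column name "sheet" are made up for illustration, and the marker is assumed to appear as the exact string 'finish' in the first column. It works on a dict of dataframes, which is what pd.read_excel returns when sheet_name=None is passed.

```python
import pandas as pd


def rows_after_marker(df, marker="finish"):
    """Return the first row of df plus every row after the marker row.

    The marker is searched for in the first column. If it is not found,
    only the first row is returned.
    """
    first_col = df.iloc[:, 0]
    hits = (first_col == marker).to_numpy().nonzero()[0]
    if len(hits) == 0:
        return df.iloc[:1]
    start = hits[0] + 1  # position of the first row after the marker
    return pd.concat([df.iloc[:1], df.iloc[start:]])


def combine_sheets(sheets, marker="finish"):
    """Stack the filtered rows of every sheet and record the sheet name."""
    frames = []
    for name, df in sheets.items():
        part = rows_after_marker(df, marker).copy()
        part["sheet"] = name  # hypothetical column name for the sheet label
        frames.append(part)
    return pd.concat(frames, ignore_index=True)


# Usage with the workbook from the question (reads all sheets in one call):
# sheets = pd.read_excel("excel1_all_data.xlsx", sheet_name=None)
# res = combine_sheets(sheets)
```

Note that pd.read_excel treats the first spreadsheet row as the column header by default, so "first row" here means the first data row. If the row you want to keep is the header row itself, passing header=None to pd.read_excel may be closer to what you need.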
