
How to efficiently iterate through selected Excel sheets in Python and append them into a Data Frame?

Instead of manually entering the parameters for each Excel sheet as follows:

import pandas as pd

df1 = pd.read_excel(r"C:\Users\XY\Sales2020.xlsm",
                     sheet_name = "Europe",usecols=[1,2,4,6],header=4)      #reads sheet "Europe", selected columns and skips first 4 rows
df1["Continent"]= "Europe"      #adds a new column with sheet name
df1=pd.DataFrame(df1)       #creates df
df1.columns=["ID", "Product", "Quantity","Price","Continent"]   #renames columns in df

df2 = pd.read_excel(r"C:\Users\XY\Sales2020.xlsm",
                     sheet_name = "North America",usecols=[1,2,4,6],header=4)
df2["Continent"]= "North America"
df2=pd.DataFrame(df2)
df2.columns=["ID", "Product", "Quantity","Price","Continent"]

df = pd.concat([df1, df2])     #concats the dfs

I would like to iterate through the sheets automatically and put the data from all of them into a single DataFrame. I tried the loop below, but it doesn't do the job: the result contains data only from the last sheet in the list:

import pandas as pd

sheets=["Europe","North America"]

for i in sheets:
    dataset = pd.read_excel(r"C:\Users\XY\Sales2020.xlsm",
                     sheet_name = i,usecols=[1,2,4,6],header=4)     #read Excel
    dataset["Continent"]= i         #adds a new column with sheet name
    dataset = pd.DataFrame(dataset)     #creates df
    dataset.columns=["ID", "Product", "Quantity","Price","Continent"]   #renames columns in df
    df= dataset.append(dataset) #this should append data from sheets into a single df

Do you have any ideas on how I could solve this?

Thanks a lot

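There is no need to wrap dataset in pd.DataFrame, because pd.read_excel already returns a DataFrame. The loop keeps only the last sheet because df is reassigned on every iteration, and dataset.append(dataset) appends the current sheet to itself rather than to the rows collected so far. (DataFrame.append was also removed in pandas 2.0.) Collect each sheet in a list and concatenate once after the loop: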

import pandas as pd

sheets=["Europe","North America"]
df_list=[]
for i in sheets:
    dataset = pd.read_excel(r"C:\Users\XY\Sales2020.xlsm",
                     sheet_name = i,usecols=[1,2,4,6],header=4)     #read Excel
    dataset["Continent"]= i         #adds a new column with sheet name
    df_list.append(dataset)
    
df = pd.concat(df_list)     #concatenates the collected sheets into a single df
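
As a possible variation, pd.read_excel also accepts a list for sheet_name and then returns a dict that maps each sheet name to its DataFrame, so the workbook is opened only once. The sketch below assumes every sheet has the same four selected columns as in the question; the helper names combine_sheets and load_sales and the COLUMNS constant are made up for illustration and are not part of pandas.

```python
import pandas as pd

# Column names from the question; assumed to apply to every sheet.
COLUMNS = ["ID", "Product", "Quantity", "Price"]


def combine_sheets(frames, label="Continent"):
    """Stack a {sheet_name: DataFrame} mapping into one DataFrame.

    `frames` has the shape pd.read_excel returns when sheet_name is a list.
    """
    tagged = []
    for name, frame in frames.items():
        frame = frame.copy()       # leave the caller's frames untouched
        frame.columns = COLUMNS    # rename the four selected columns
        frame[label] = name        # record which sheet the rows came from
        tagged.append(frame)
    # ignore_index=True gives the result a fresh 0..n-1 index
    return pd.concat(tagged, ignore_index=True)


def load_sales(path, sheets):
    """Read the listed sheets in one call and combine them (hypothetical helper)."""
    # A list for sheet_name makes read_excel return a dict of DataFrames.
    frames = pd.read_excel(path, sheet_name=sheets, usecols=[1, 2, 4, 6], header=4)
    return combine_sheets(frames)
```

With the file from the question, this would be called as df = load_sales(r"C:\Users\XY\Sales2020.xlsm", ["Europe", "North America"]). Renaming the columns before adding "Continent" means only the four original names have to be listed.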


 