
Loop through excel files, extract means from column data and add to dataframe

I want to loop through a group of files in a folder. For each file, I want to locate a specific column (e.g. 'FF(Hz)'), find the max value in that column, and add it to a single dataframe, so that I end up with a column of the max values from each file. I had a go at doing this for two columns, but it just fills each column with a single repeated value.

IFpath = r"C:\Users\useri\folder\testfolder"
F_files = glob.glob(IFpath + "/*.xlsx")

for file in F_files:
    fn = pd.read_excel(file,sheetname='Sheet1')  
    MaxFF = (fn['FF(Hz)'].max())    
    Maxspikes = (fn['Spike'].max())

dfsum = pd.DataFrame({'Max_FF': MaxFF, 'Max_spikes': Maxspikes})

This returns something like this:

     Max_FF  Max_spikes
      200     5
      200     5
      200     5
      ...     ...

You need to store the intermediate MaxFF and Maxspikes values while you loop over the files. Currently you overwrite both every time you open a new file, so only the values from the last file are left when the dataframe is built.

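import glob
import pandas as pd

IFpath = r"C:\Users\useri\folder\testfolder"
F_files = glob.glob(IFpath + "/*.xlsx")

list_of_maxes = []
for file in F_files:
    # current pandas uses sheet_name; the older sheetname keyword is no longer accepted
    fn = pd.read_excel(file, sheet_name='Sheet1')
    MaxFF = fn['FF(Hz)'].max()
    Maxspikes = fn['Spike'].max()
    # keep one row per file instead of overwriting the previous values
    list_of_maxes.append([file, MaxFF, Maxspikes])

# one row per file, with named columns
dfsum = pd.DataFrame(list_of_maxes, columns=['File', 'Max_FF', 'Max_spikes'])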
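
If it helps to check the accumulation logic without any Excel files on disk, here is a minimal sketch of the same pattern. The helper name summarize_maxes, the "file" column, and the demo data are all made up for illustration; only the column names 'FF(Hz)' and 'Spike' come from the question.

```python
import pandas as pd


def summarize_maxes(frames, columns=("FF(Hz)", "Spike")):
    """Return one row of column maxima per input table.

    `frames` maps a label (for example a file path) to an
    already-loaded DataFrame. Hypothetical helper, not a pandas API.
    """
    rows = []
    for label, frame in frames.items():
        # Build a fresh row for each table so nothing is overwritten
        row = {"file": label}
        for col in columns:
            row["Max_" + col] = frame[col].max()
        rows.append(row)
    # A list of dicts becomes one DataFrame row per dict
    return pd.DataFrame(rows)


# Made-up stand-ins for two workbooks
demo = {
    "a.xlsx": pd.DataFrame({"FF(Hz)": [100, 200], "Spike": [3, 5]}),
    "b.xlsx": pd.DataFrame({"FF(Hz)": [150, 120], "Spike": [7, 2]}),
}
summary = summarize_maxes(demo)
print(summary)
```

With real files, the mapping could be built as {f: pd.read_excel(f, sheet_name='Sheet1') for f in F_files}, assuming every workbook has a sheet called Sheet1 containing both columns.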
