Combining CSVs from different folders into an Excel sheet using Python

I have several folders (3) that contain CSVs with the same name. Each CSV captures the correlation of every variable with the dependent variable for one data period and churn period combination, as below:

Data Period     Jan'18      
Churn Period    Feb'18      

Variable_Name       correlation 
Pending_Disconnect  0.553395448 
status_Active       0.539464806 
days_active         0.414774231 
days_pend_disco     0.392915837 
prop_tenure         0.074321692 
abs_change_3m       0.062267386 

So the 3 CSVs, which share a name but differ in content, are collated from the 3 folders into one workbook as shown below:

Data Period         Jan'18              Data Period     Jan'18              Data Period     Jan'18      
Churn Period        Feb'18              Churn Period    Mar'18              Churn Period    Apr'18      

Variable_Name       correlation         Variable_Name   correlation         Variable_Name   correlation
Pending_Disconnect  0.553395448         Pending_Change  0.043461995         active_frq_N    0.025697016
status_Active       0.539464806         status_Active   0.038057697         active_frq_Y    0.025697016
days_active         0.414774231         ethnic          0.037503202         ethnic          0.025195149
days_pend_disco     0.392915837         days_active     0.037227245         ecgroup         0.023192408
prop_tenure         0.074321692         archetype_grp   0.035761434         age             0.023121305
abs_change_3m       0.062267386         age_nan         0.035761434         archetype_nan   0.023121305

The objective is to compare how the correlations change month over month.

How do I extract the CSVs from the different folders and collate them into a single sheet of an Excel workbook using Python? Currently I paste the content of each CSV into the Excel sheet manually to create the report, but I need to automate this.

Can someone please help me with this?

The folder structure looks like this:

[Image: folder structure]

And the Excel sheet should look like this after the operation:

[Image: expected Excel sheet]

You can do something like this:

import glob
import os

rootdir = '/home/my/folders'  # parent directory of the folders (1 Jan-Feb, 2 Jan-Mar, etc.)

# Collect the CSV paths of each immediate subfolder, in sorted folder order
f = []
for d in sorted(os.listdir(rootdir)):
    folder = os.path.join(rootdir, d)
    if os.path.isdir(folder):
        f.append(sorted(glob.glob(os.path.join(folder, '*.csv'))))

f = list(filter(None, f))  # drop folders that contain no CSV files
# f is a list of lists: one list of CSV paths per folder

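Now create a DataFrame for each CSV in the list f:

import pandas as pd

# f is a list of lists (one per folder), so flatten it while reading;
# f[0] alone would only cover the first folder
dfs = [pd.read_csv(path) for folder_files in f for path in folder_files]
df = pd.concat(dfs, axis=1)  # axis=1 places the tables side by side instead of stacking them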

This combines all your DataFrames into a single df.
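
To see what the direction of the concatenation changes, here is a small sketch using two in-memory stand-ins for the CSVs. It assumes the files are comma-separated with a Variable_Name,correlation header row and leaves out the Data Period / Churn Period lines for brevity; the values are copied from the question.

```python
import io

import pandas as pd

# Two tiny stand-ins for the per-folder CSVs (assumed comma-separated).
csv_feb = (
    "Variable_Name,correlation\n"
    "Pending_Disconnect,0.553395448\n"
    "status_Active,0.539464806\n"
)
csv_mar = (
    "Variable_Name,correlation\n"
    "Pending_Change,0.043461995\n"
    "status_Active,0.038057697\n"
)

dfs = [pd.read_csv(io.StringIO(text)) for text in (csv_feb, csv_mar)]

stacked = pd.concat(dfs)               # default axis=0: rows are appended
side_by_side = pd.concat(dfs, axis=1)  # axis=1: columns are appended

print(stacked.shape)       # (4, 2)
print(side_by_side.shape)  # (2, 4)
```

With axis=1 the rows are aligned on the index, so if one CSV has more rows than another, the shorter block is padded with NaN, which shows up as empty cells in Excel.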

Now you can write this to Excel using the to_excel() method of the pandas DataFrame.

Note: you may have to adjust your DataFrames a bit to make them concatenate properly.
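
If concatenating gets awkward, one alternative is to skip concat and write each CSV to the same sheet at its own column offset using the startcol parameter of to_excel(). The sketch below is only an illustration under some assumptions: the helper names (find_csvs, start_columns, write_side_by_side) are made up for this example, every row of a CSV is assumed to have the same number of comma-separated fields, and writing .xlsx requires an Excel engine such as openpyxl to be installed.

```python
import glob
import os

import pandas as pd


def find_csvs(rootdir, filename):
    """Return the paths of `filename` in each immediate subfolder of rootdir, sorted."""
    return sorted(glob.glob(os.path.join(rootdir, "*", filename)))


def start_columns(frames, gap=1):
    """Compute the first sheet column of each frame, leaving `gap` empty columns between blocks."""
    starts, col = [], 0
    for frame in frames:
        starts.append(col)
        col += frame.shape[1] + gap
    return starts


def write_side_by_side(paths, out_path, sheet_name="Sheet1"):
    """Write every CSV onto one sheet, each block to the right of the previous one."""
    # header=None keeps the "Data Period" / "Churn Period" lines as ordinary rows.
    frames = [pd.read_csv(p, header=None) for p in paths]
    with pd.ExcelWriter(out_path) as writer:  # .xlsx output needs an engine such as openpyxl
        for frame, col in zip(frames, start_columns(frames)):
            frame.to_excel(writer, sheet_name=sheet_name, startcol=col,
                           index=False, header=False)
```

It could be called as write_side_by_side(find_csvs('/home/my/folders', 'corr.csv'), 'report.xlsx'), where corr.csv and report.xlsx are placeholder names. The folders are processed in sorted name order, which matches prefixes like 1, 2, 3 as long as there are fewer than ten of them.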

Let me know if this helps.

