I have three folders, each containing a CSV with the same name. Each CSV captures the correlation of every variable with the dependent variable for one data period and churn period combination, as below:
Data Period         Jan'18
Churn Period        Feb'18
Variable_Name       correlation
Pending_Disconnect  0.553395448
status_Active       0.539464806
days_active         0.414774231
days_pend_disco     0.392915837
prop_tenure         0.074321692
abs_change_3m       0.062267386
So three CSVs with the same name but different content, one from each folder, are collated into a workbook as shown below:
Data Period         Jan'18       Data Period     Jan'18       Data Period    Jan'18
Churn Period        Feb'18       Churn Period    Mar'18       Churn Period   Apr'18
Variable_Name       correlation  Variable_Name   correlation  Variable_Name  correlation
Pending_Disconnect  0.553395448  Pending_Change  0.043461995  active_frq_N   0.025697016
status_Active       0.539464806  status_Active   0.038057697  active_frq_Y   0.025697016
days_active         0.414774231  ethnic          0.037503202  ethnic         0.025195149
days_pend_disco     0.392915837  days_active     0.037227245  ecgroup        0.023192408
prop_tenure         0.074321692  archetype_grp   0.035761434  age            0.023121305
abs_change_3m       0.062267386  age_nan         0.035761434  archetype_nan  0.023121305
The objective is to compare how the correlations change month on month.
How do I extract the CSVs from the different folders and collate them into a single sheet of an Excel workbook using Python? Currently I paste the content of each CSV into the Excel sheet manually to create the report, but I need to automate this.
Can someone please help me with this?
The folder structure looks like below:
And the excel sheet should appear as below after the operation:
You can do something like this:
import glob
import os

import pandas as pd

rootdir = '/home/my/folders'  # Parent directory of the folders 1 Jan-Feb, 2Jan-Mar, etc.
f = list()
# Look in each immediate subfolder of rootdir, in sorted order
for d in sorted(os.listdir(rootdir)):
    # extend() keeps f a flat list; folders without CSVs add nothing
    f.extend(glob.glob(os.path.join(rootdir, d, '*.csv')))
# f now contains the CSV files from all folders
Now, create dataframes for all the CSVs in list f:
dfs = [pd.read_csv(file) for file in f]  # one dataframe per CSV
df = pd.concat(dfs)
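Since every folder holds a CSV with the same name, it may help to keep track of which folder each file came from before combining anything. The following is only a sketch: the helper name find_csvs_by_folder is hypothetical, and it assumes the CSVs sit directly inside the immediate subfolders of rootdir.

```python
from pathlib import Path


def find_csvs_by_folder(rootdir):
    """Map each immediate subfolder name to the sorted CSV paths inside it."""
    found = {}
    for sub in sorted(Path(rootdir).iterdir()):
        if sub.is_dir():
            csvs = sorted(sub.glob('*.csv'))
            if csvs:  # skip folders that contain no CSV at all
                found[sub.name] = csvs
    return found
```

The folder names come back in sorted order, so prefixing them with a number (as in 1 Jan-Feb) would keep the months in sequence.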
This has combined all your dataframes into one df.
Now, you can write this to Excel using the to_excel() function of pandas.
Note: you might have to play a bit with your dataframes to make them concatenate properly.
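One possible way to get the side-by-side layout from the question is to read each CSV without a header row and concatenate along the columns instead of the rows. This is a sketch under assumptions: the function names collate_side_by_side and write_report and the output file name are hypothetical, each CSV is assumed to have exactly two comma-separated columns on every line, and to_excel needs an Excel engine such as openpyxl to be installed.

```python
import pandas as pd


def collate_side_by_side(csv_sources):
    """Read each CSV without a header row and place the tables next to each other."""
    # header=None keeps the "Data Period" / "Churn Period" lines as ordinary rows
    frames = [pd.read_csv(src, header=None) for src in csv_sources]
    # axis=1 puts the frames side by side; rows are matched by position,
    # and shorter files are padded with NaN at the bottom
    return pd.concat(frames, axis=1, ignore_index=True)


def write_report(combined, path='correlations.xlsx'):
    """Write the collated table to a single sheet (file name is an assumption)."""
    combined.to_excel(path, sheet_name='Sheet1', index=False, header=False)
```

Calling write_report(collate_side_by_side(paths)) with the list of CSV paths, in the order the months should appear, would then produce one sheet with a pair of columns per folder.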
Let me know if this helps.