I have 125 CSV files with the same column names, and I want to merge all of them on a common column. I tried the following code, but it didn't work (it seems to run in an infinite loop).
import glob
from functools import reduce

import pandas as pd

filelist = glob.glob('*.csv')
dflist = []
for filename in filelist:
    df = pd.read_csv(filename)
    dflist.append(df)
df_2 = reduce(lambda left, right: pd.merge(left, right, on=['gene_id'], how='outer'), dflist)
I can't use pd.concat, e.g. df_new = pd.concat([df1, df2, df3, df4], axis=1), since the CSV files differ in their number of rows.
Is there any other way to perform pd.merge on multiple files?
Thanks in advance!
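For what it's worth, the reduce approach itself does work once the imports and loop indentation are in place. A minimal, self-contained sketch, using three tiny synthetic CSVs written to a temporary directory (the file names and expression-column names here are illustrative, not from the original data):

```python
import glob
import os
import tempfile
from functools import reduce

import pandas as pd

tmpdir = tempfile.mkdtemp()
# Each file shares 'gene_id' but carries its own value column,
# so the merged columns stay distinct.
data = {
    "a.csv": {"gene_id": ["g1", "g2"], "expr_a": [1, 2]},
    "b.csv": {"gene_id": ["g2", "g3"], "expr_b": [3, 4]},
    "c.csv": {"gene_id": ["g1", "g3"], "expr_c": [5, 6]},
}
for name, cols in data.items():
    pd.DataFrame(cols).to_csv(os.path.join(tmpdir, name), index=False)

filelist = sorted(glob.glob(os.path.join(tmpdir, "*.csv")))
dflist = [pd.read_csv(f) for f in filelist]

# Fold the list of frames into one wide frame with an outer merge on gene_id.
merged = reduce(
    lambda left, right: pd.merge(left, right, on="gene_id", how="outer"),
    dflist,
)
print(merged.shape)  # (3, 4): three genes, gene_id plus one column per file
```

Note that if every file used identical value-column names, the merge would disambiguate them with _x/_y suffixes, which quickly becomes unreadable; giving each file's value column a distinct name (or renaming on load) avoids that.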
Try this code (note the filelist[1:] slice, so the first file is not merged with itself):
import glob

import pandas as pd

filelist = glob.glob('*.csv')
merged_df = pd.read_csv(filelist[0])
for filename in filelist[1:]:  # skip the first file; it is already loaded
    df = pd.read_csv(filename)
    merged_df = pd.merge(merged_df, df, on=['gene_id'], how='outer')
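As an aside, pd.concat can still be used despite the differing row counts: if each frame is first indexed on gene_id, concat aligns rows on the index (an outer join by default on axis=1). A minimal sketch with two hypothetical stand-in frames:

```python
import pandas as pd

# Hypothetical tiny frames standing in for two of the CSVs.
df_a = pd.DataFrame({"gene_id": ["g1", "g2"], "expr_a": [1, 2]})
df_b = pd.DataFrame({"gene_id": ["g2", "g3"], "expr_b": [3, 4]})

# Indexing on gene_id lets concat align rows even though the frames
# have different numbers of rows.
combined = pd.concat(
    [df.set_index("gene_id") for df in (df_a, df_b)], axis=1
).reset_index()
print(combined.shape)  # (3, 3)
```

For many files of similar shape this single concat is typically faster than 124 successive pairwise merges, though it requires the non-key columns to be uniquely named across files.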