
How to perform an outer merge on multiple DataFrames using pandas in Python

I have 125 CSV files with the same column names. I want to merge all of them on a shared column (gene_id). I tried the following code, but it didn't work (it seems to run in an infinite loop).

import glob
from functools import reduce

import pandas as pd

filelist = glob.glob('*.csv')
dflist = []
for filename in filelist:
    df = pd.read_csv(filename)
    dflist.append(df)
df_2 = reduce(lambda left, right: pd.merge(left, right, on=['gene_id'], how='outer'), dflist)

I can't use pd.concat, as in df_new = pd.concat([df1, df2, df3, df4], axis=1), because the CSV files have different numbers of rows.

Is there another way to perform pd.merge on multiple files?

Thanks in advance!

Try merging the files one at a time in a loop:

import glob

import pandas as pd

filelist = glob.glob('*.csv')
# Start from the first file, then merge in the remaining ones.
merged_df = pd.read_csv(filelist[0])

for filename in filelist[1:]:
    df = pd.read_csv(filename)
    merged_df = pd.merge(merged_df, df, on=['gene_id'], how='outer')
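
Because every file has the same column names, repeated merges can produce clashing suffixed columns. In addition, if gene_id is repeated within a file, each merge multiplies the matching rows, which is one possible reason the original attempt appeared to hang. The following is a minimal sketch of an alternative, not a definitive fix: it assumes gene_id is unique within each file, checks that assumption, and uses gene_id as the index so that pd.concat can align rows of different lengths. The function name outer_merge_all and the label-based column prefixes are illustrative choices that do not come from the question.

```python
import pandas as pd


def outer_merge_all(frames, key="gene_id"):
    """Outer-join a dict of {label: DataFrame} on `key`.

    Non-key columns are prefixed with their label so that same-named
    columns from different files do not collide.
    """
    indexed = []
    for label, df in frames.items():
        # Assumption: `key` is unique within each frame. Duplicates would
        # multiply rows in a merge and cannot be aligned by concat.
        if df[key].duplicated().any():
            raise ValueError(f"duplicate {key} values in {label!r}")
        indexed.append(df.set_index(key).add_prefix(f"{label}_"))

    # With axis=1 and join="outer", rows are aligned on the index and the
    # result holds the union of all keys; missing values become NaN.
    merged = pd.concat(indexed, axis=1, join="outer")
    return merged.rename_axis(key).reset_index()
```

With the files from the question, frames could be built as {filename: pd.read_csv(filename) for filename in glob.glob('*.csv')}, so that each output column is prefixed with the name of the file it came from.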

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint this content, please cite the site URL or the original address.
