Python pandas - merge csv files in directory into one

I have a directory with CSV files:

frames/df1.csv
       df2.csv

The frames are structured like this:

df1.csv

               artist            track        plays
1            Pearl Jam           Jeremy         456
2   The Rolling Stones   Heart of Stone         546

df2.csv

                artist            track        likes
3            Pearl Jam           Jeremy         5673
9   The Rolling Stones   Heart of Stone         3456

and I would like to merge all frames into one, ending up with:

              artist            track          plays       likes    
0          Pearl Jam           Jeremy            456        5673       
1 The Rolling Stones   Heart of Stone            546        3456       

I've tried:

import glob
import pandas as pd

path = 'frames'
all_files = glob.glob(path + "/*.csv")
list_ = []
for file_ in all_files:
    df = pd.read_csv(file_, index_col=None, header=0)
    list_.append(df)
frame = pd.concat(list_)

to no avail. What is the best way to approach this?

I simply used your code to create the list of DataFrames:

import glob
import pandas as pd

path = 'frames'
all_files = glob.glob(path + "/*.csv")
l = []
for file_ in all_files:
    df = pd.read_csv(file_, index_col=None, header=0)
    l.append(df)

Then use functools.reduce to merge the list of DataFrames into one:

import functools
# l is the list of DataFrames built above: [df1, df2, df3, ...]
merged_df = functools.reduce(lambda left, right: pd.merge(left, right, on=['artist', 'track']), l)
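
To make the reduce step concrete, here is a minimal self-contained sketch. It assumes the two frames from the question are built in memory instead of being read from disk; the variable name `frames` is only illustrative. pd.merge defaults to an inner join, so only (artist, track) pairs present in every frame survive.

```python
import functools

import pandas as pd

# Sample data mirroring df1.csv and df2.csv from the question
# (built in memory here; in practice these come from pd.read_csv).
df1 = pd.DataFrame(
    {
        "artist": ["Pearl Jam", "The Rolling Stones"],
        "track": ["Jeremy", "Heart of Stone"],
        "plays": [456, 546],
    },
    index=[1, 2],
)
df2 = pd.DataFrame(
    {
        "artist": ["Pearl Jam", "The Rolling Stones"],
        "track": ["Jeremy", "Heart of Stone"],
        "likes": [5673, 3456],
    },
    index=[3, 9],
)
frames = [df1, df2]

# Fold the list pairwise: merge(merge(df1, df2), df3), and so on.
merged_df = functools.reduce(
    lambda left, right: pd.merge(left, right, on=["artist", "track"]),
    frames,
)
print(merged_df)
```

The original row labels (1, 2 and 3, 9) are discarded by the merge, and the result gets a fresh 0-based index. If rows that are missing from some files should be kept, passing how='outer' to pd.merge is one option.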

DataFrame.join is useful. It's analogous to a SQL join. Something like:

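# join matches the `on` columns of df1 against the index of the other frame
df1.join(df2.set_index(['artist', 'track']), on=['artist', 'track'])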
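
A self-contained sketch of the join approach, again assuming the question's two frames are built in memory. DataFrame.join matches the `on` columns of the calling frame against the index of the other frame, so this sketch first moves the key columns of df2 into its index with set_index. The reset_index call at the end is an optional extra to get the 0-based index shown in the desired output.

```python
import pandas as pd

# Sample data mirroring df1.csv and df2.csv from the question.
df1 = pd.DataFrame(
    {
        "artist": ["Pearl Jam", "The Rolling Stones"],
        "track": ["Jeremy", "Heart of Stone"],
        "plays": [456, 546],
    },
    index=[1, 2],
)
df2 = pd.DataFrame(
    {
        "artist": ["Pearl Jam", "The Rolling Stones"],
        "track": ["Jeremy", "Heart of Stone"],
        "likes": [5673, 3456],
    },
    index=[3, 9],
)

# Index df2 by the key columns so join can look up each
# (artist, track) pair from df1 in that index.
joined = df1.join(df2.set_index(["artist", "track"]), on=["artist", "track"])

# join keeps df1's row labels; renumber them from 0.
joined = joined.reset_index(drop=True)
print(joined)
```

Unlike pd.merge, join defaults to a left join, so every row of df1 is kept even when df2 has no matching (artist, track) pair.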
