Merging multiple CSV files using pandas

I am struggling with merging multiple .csv files using pandas.

All of the files have the same structure, shown below. The 'UniqueColumn' differs in every csv, while the "Name" column is the same in every csv, although the rows are not sorted in the same order:

csv1:
Name, UniqueColumnA
testName, DataA
...

csv2:
Name, UniqueColumnB
testName, DataB
...

etc.

The desired merged csv file would look like this:

Name, UniqueColumnA, UniqueColumnB, UniqueColumnC
testName, DataA, DataB, DataC

I've tried to use the following code:

import glob
import pandas as pd

files = glob.glob(r'pathname*.csv')
df = pd.concat([pd.read_csv(f, index_col=['Name']) for f in files])
df.to_csv('merged.csv')

But the output was:

testName, DataA
testName, DataB
...

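I'm not very familiar with Python, especially with pandas, so I would really appreciate a helping hand here.

By default, pd.concat stacks the frames on top of each other (axis=0), which is why the same Name shows up once per file as a duplicate index entry. You need to tell pandas to align the frames on the index instead; in your case you want an inner join along the columns, so the following should work for you: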

df = pd.concat([pd.read_csv(f, index_col='Name') for f in files], join='inner', axis=1)
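
To make the effect of axis=1 and join='inner' easier to see, here is a minimal, self-contained sketch. It uses in-memory strings in place of the real files, and the extra row "otherName", the sample values and the helper name merge_on_name are made up for illustration. It also assumes each Name appears at most once per file and that the headers contain no spaces after the commas (if they do, passing skipinitialspace=True to read_csv is one way to deal with that).

```python
import io

import pandas as pd

# Hypothetical stand-ins for the csv files; note the rows are in different orders.
csv_a = io.StringIO("Name,UniqueColumnA\ntestName,DataA\notherName,DataA2\n")
csv_b = io.StringIO("Name,UniqueColumnB\notherName,DataB2\ntestName,DataB\n")
csv_c = io.StringIO("Name,UniqueColumnC\ntestName,DataC\notherName,DataC2\n")


def merge_on_name(sources):
    """Read each source with 'Name' as the index and join them column-wise."""
    frames = [pd.read_csv(src, index_col="Name") for src in sources]
    # axis=1 places the frames side by side, aligned on the 'Name' index;
    # join='inner' keeps only the names present in every frame.
    return pd.concat(frames, axis=1, join="inner")


merged = merge_on_name([csv_a, csv_b, csv_c])
print(merged)

# to_csv() without a path returns the csv text instead of writing a file.
merged_csv_text = merged.to_csv()
```

With real files, the same helper could be called as merge_on_name(glob.glob(r'pathname*.csv')) and the result written out with merged.to_csv('merged.csv'). If some names are missing from some files and you want to keep them anyway, join='outer' would keep every name and fill the gaps with NaN.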
