How to combine two csv files together

I already looked at "How to combine 2 csv files with common column value, but both files have different number of lines" and "Merging two CSV files using Python", but neither gave me the output I need.

I have two csv files containing the following data:

The first file is data1.csv

Name             Dept        Company  
John Smith       candy       lead
Diana Princ      candy       lead
Perry Plat       wood        lead
Jerry Springer   clothes     lead
Calvin Klein     clothes     lead   
Lincoln Tun      warehouse   lead   
Oliver Twist     kitchen     lead

The second file is data2.csv

Name             Dept        Company  
John Smith       candy       lead
Tyler Perry      candy       lead
Perry Plat       wood        lead
Mary Poppins     clothes     lead
Calvin Klein     clothes     lead   
Lincoln Tun      warehouse   lead   
Herman Sherman   kitchen     lead
Jerry Springer   clothes     lead
Ivan Evans       clothes     lead

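I want to combine them into a single file called newdata.csv, with the rows grouped by the Dept column and the Company column removed. The final output should look like this: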

Name             Dept        
John Smith       candy       
Diana Princ      candy       
Tyler Perry      candy       
Perry Plat       wood       
Jerry Springer   clothes     
Calvin Klein     clothes     
Mary Poppins     clothes     
Ivan Evans       clothes     
Lincoln Tun      warehouse   
Oliver Twist     kitchen     
Herman Sherman   kitchen   

I tried using the merge function, but the output is not what I need.

This is my code so far:

import pandas as pd
import os, csv, sys

csvPath1 = 'data1.csv'
csvPath2 = 'data2.csv'
csvDest = 'newdata.csv'

df1 = pd.read_csv(csvPath1)
df2 = pd.read_csv(csvPath2)

df1=df1.drop(columns='Company')
df2=df2.drop(columns='Company')

merged = df1.merge(df2)
merged=merged.sort_values('Dept')

merged.to_csv(csvDest, index=False)

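merge is the pandas equivalent of a SQL join: called with no arguments, it keeps only the rows that appear in both frames.

The function you need is concat: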

merged = pd.concat([df1, df2], axis=0, ignore_index=True)
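
To see the difference on a small scale, here is a minimal sketch using a few rows from the two files above. The frames are built inline rather than read from disk, and the Company column is assumed to be dropped already; both choices are only for illustration.

```python
import pandas as pd

# A few rows from data1.csv and data2.csv (Company already dropped)
df1 = pd.DataFrame({
    "Name": ["John Smith", "Diana Princ", "Perry Plat"],
    "Dept": ["candy", "candy", "wood"],
})
df2 = pd.DataFrame({
    "Name": ["John Smith", "Tyler Perry", "Perry Plat"],
    "Dept": ["candy", "candy", "wood"],
})

# Default merge: inner join on the shared columns (Name and Dept),
# so only rows that appear in both frames are kept
joined = df1.merge(df2)

# concat with axis=0 places the rows of df2 below those of df1
stacked = pd.concat([df1, df2], axis=0, ignore_index=True)

print(len(joined))   # 2 rows: John Smith and Perry Plat
print(len(stacked))  # 6 rows: every row from both frames, duplicates included
```

Note that concat does not remove the rows the two files have in common, so John Smith and Perry Plat each show up twice in stacked.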

I eventually found the answer to my own question. After some digging, this is what worked for me:

merged=pd.concat([df1, df2])  # originally df1.append(df2); DataFrame.append was removed in pandas 2.0
merged=merged.sort_values('Dept')

So my final code is:

import pandas as pd
import os, csv, sys

csvPath1 = 'data1.csv'
csvPath2 = 'data2.csv'
csvDest = 'newdata.csv'

df1 = pd.read_csv(csvPath1)
df2 = pd.read_csv(csvPath2)

df1=df1.drop(columns='Company')
df2=df2.drop(columns='Company')

merged=pd.concat([df1, df2])  # df1.append(df2) on pandas < 2.0
merged=merged.sort_values('Dept')

merged.to_csv(csvDest, index=False)
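
One caveat: the code above keeps the rows that occur in both files and sorts Dept alphabetically, whereas the desired output in the question lists each person once and keeps the departments in the order they first appear. If that exact layout matters, a sketch along the following lines may help. It is not part of the original answers; the inline frames stand in for the two csv files, and dept_rank is a helper name introduced here for illustration.

```python
import pandas as pd

cols = ["Name", "Dept"]
# Contents of data1.csv and data2.csv with Company already dropped
df1 = pd.DataFrame([
    ["John Smith", "candy"], ["Diana Princ", "candy"], ["Perry Plat", "wood"],
    ["Jerry Springer", "clothes"], ["Calvin Klein", "clothes"],
    ["Lincoln Tun", "warehouse"], ["Oliver Twist", "kitchen"],
], columns=cols)
df2 = pd.DataFrame([
    ["John Smith", "candy"], ["Tyler Perry", "candy"], ["Perry Plat", "wood"],
    ["Mary Poppins", "clothes"], ["Calvin Klein", "clothes"],
    ["Lincoln Tun", "warehouse"], ["Herman Sherman", "kitchen"],
    ["Jerry Springer", "clothes"], ["Ivan Evans", "clothes"],
], columns=cols)

# Stack both frames, then keep only the first copy of each repeated row
merged = pd.concat([df1, df2], ignore_index=True).drop_duplicates()

# Hypothetical helper: rank each Dept by where it first appears
dept_rank = {dept: i for i, dept in enumerate(merged["Dept"].unique())}

# A stable sort on that rank groups the departments without
# reordering the names inside each group
merged = merged.sort_values(
    "Dept", key=lambda s: s.map(dept_rank), kind="mergesort"
)

print(merged.to_string(index=False))
```

Writing the result with merged.to_csv(csvDest, index=False) then works as before. The key argument of sort_values needs pandas 1.1 or later.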

