How to use pandas DataFrames to merge CSVs that share more than one column and add only the columns that differ
This question is similar to a simple MySQL operation:
UPDATE hpaai_month_div t, fahafa_monthly s SET t.col1=s.col1 WHERE t.col2=s.col2 AND t.year=s.year AND t.month=s.month;
Data:
CSV A
month year col2    col1
abc   2000 DEFSSDS 190
def   2001 GHISFDS 210
ghi   2002 SJDYHGF 910

CSV B
month year col2    col1 stat_fips
abc   2000 DEFSSDS 0    a
def   2001 GHISFDS 0    b
ghi   2002 SJDYHGF 0    c

Resulting CSV:
month year col2    col1 stat_fips
abc   2000 DEFSSDS 190  a
def   2001 GHISFDS 210  b
ghi   2002 SJDYHGF 910  c

Code so far (not working as expected):
import pandas as pd

df_a = pd.read_csv('a.csv')
df_b = pd.read_csv('b.csv')
merged_df = pd.merge(df_a, df_b, on="col1", how="left")
merged_df = pd.concat([merged_df], axis=1)
merged_df.to_csv('final_output.csv', encoding='utf-8', index=False)
print(open('final_output.csv').read())
How can I get the data shown above as the resulting CSV?
It looks like you need merge on the three shared key columns, and then drop the col1_ column at the end:
# default inner join; the overlapping col1 from df2 is renamed col1_ and then dropped
df = (pd.merge(df1, df2, on=['col2','year','month'], suffixes=('','_'))
        .drop('col1_', axis=1))
print (df)
month year col2 col1 stat_fips
0 abc 2000 DEFSSDS 190 a
1 def 2001 GHISFDS 210 b
2 ghi 2002 SJDYHGF 910 c
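For anyone who wants to reproduce this without the files on disk, here is a minimal self-contained sketch of the same merge. The inline strings stand in for a.csv and b.csv and are assumed to be comma-separated, which may not match the real files; the `keys` variable is only there for readability.

```python
import io

import pandas as pd

# Inline stand-ins for a.csv and b.csv (comma-separated by assumption).
csv_a = """month,year,col2,col1
abc,2000,DEFSSDS,190
def,2001,GHISFDS,210
ghi,2002,SJDYHGF,910
"""
csv_b = """month,year,col2,col1,stat_fips
abc,2000,DEFSSDS,0,a
def,2001,GHISFDS,0,b
ghi,2002,SJDYHGF,0,c
"""

df1 = pd.read_csv(io.StringIO(csv_a))
df2 = pd.read_csv(io.StringIO(csv_b))

# Join on the three shared key columns. The overlapping non-key column
# col1 keeps its name on the left side and becomes col1_ on the right side.
keys = ['col2', 'year', 'month']
df = (
    pd.merge(df1, df2, on=keys, suffixes=('', '_'))
    .drop('col1_', axis=1)  # discard the col1 values that came from df2
)

# Write the result as CSV text; pass a filename to to_csv to save it instead.
print(df.to_csv(index=False))
```

The original attempt merged on col1 only, so no rows matched (190/210/910 versus 0) and the left join filled every column taken from b.csv with NaN.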
# for comparison: with the default suffixes, the overlapping column becomes col1_x / col1_y
df = pd.merge(df1, df2, on=['col2','year','month'])
print (df)
month year col2 col1_x col1_y stat_fips
0 abc 2000 DEFSSDS 190 0 a
1 def 2001 GHISFDS 210 0 b
2 ghi 2002 SJDYHGF 910 0 c
# with suffixes=('','_'), the left col1 keeps its name and the right one becomes col1_
df = pd.merge(df1, df2, on=['col2','year','month'], suffixes=('','_'))
print (df)
month year col2 col1 col1_ stat_fips
0 abc 2000 DEFSSDS 190 0 a
1 def 2001 GHISFDS 210 0 b
2 ghi 2002 SJDYHGF 910 0 c
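If you drop 'col1' from 'df_b' beforehand, you can merge with the default settings, because merge then joins on all the remaining common columns (month, year, col2).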
df_a.merge(df_b.drop(columns='col1'))
month year col2 col1 stat_fips
0 abc 2000 DEFSSDS 190 a
1 def 2001 GHISFDS 210 b
2 ghi 2002 SJDYHGF 910 c
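The same idea can be generalized so that you do not have to name the overlapping column by hand. The sketch below is only one possible way to do it: `add_new_columns` is a hypothetical helper name, not a pandas function, and using how='left' (so that every row of df_a is kept even without a match in df_b) is an assumption about what you want.

```python
import pandas as pd


def add_new_columns(left, right, keys):
    """Hypothetical helper: left-join only the columns of `right` that `left` lacks."""
    keys = list(keys)
    # Columns present in right but not in left; the key columns are in both,
    # so they are excluded here and added back explicitly for the join.
    new_cols = [c for c in right.columns if c not in left.columns]
    return left.merge(right[keys + new_cols], on=keys, how='left')


df_a = pd.DataFrame({
    'month': ['abc', 'def', 'ghi'],
    'year': [2000, 2001, 2002],
    'col2': ['DEFSSDS', 'GHISFDS', 'SJDYHGF'],
    'col1': [190, 210, 910],
})
df_b = pd.DataFrame({
    'month': ['abc', 'def', 'ghi'],
    'year': [2000, 2001, 2002],
    'col2': ['DEFSSDS', 'GHISFDS', 'SJDYHGF'],
    'col1': [0, 0, 0],
    'stat_fips': ['a', 'b', 'c'],
})

result = add_new_columns(df_a, df_b, ['month', 'year', 'col2'])
print(result)
```

Because df_b's col1 is never selected, there is no suffixed duplicate to drop afterwards. Rows of df_a without a match in df_b would get NaN in the added columns.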