
Merge dataframes

I'm trying to merge these two dataframes:

df1=  
     pais   ano  cantidad
 0  Chile  2000        10
 1  Chile  2001        11
 2  Chile  2002        12

df2=
     pais   ano  cantidad
 0  Chile  1999         0
 1  Chile  2000         0
 2  Chile  2001         0
 3  Chile  2002         0
 4  Chile  2003         0

I'm trying to merge df1 into df2 and replace the existing rows in df2 with those from df1 wherever the ano values match. This is the code I'm trying right now and what I'm getting:

df=df1.combine_first(df2)

df=
    pais    ano     cantidad
0   Chile   2000.0  10.0
1   Chile   2001.0  11.0
2   Chile   2002.0  12.0
3   Chile   2002.0  0.0
4   Chile   2003.0  0.0

As you can see, the row corresponding to 1999 is missing, and the row for 2002 with 'cantidad' = 0 shouldn't be there. My desired output is this:

df=
    pais    ano     cantidad
0   Chile   1999    0
1   Chile   2000    10
2   Chile   2001    11
3   Chile   2002    12
4   Chile   2003    0

Any ideas? Thank you!

Add the how='outer' parameter to the merge.

By default, merge uses how="inner", which keeps only the values present in both dataframes (the intersection), whereas you want the union of the two.

Also, you may want to add on="ano" to declare which column to merge on. It may not be needed in your case, but it's worth checking.

Please check Pandas Merging 101 for more details.
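
As a rough sketch of the outer-merge suggestion (not necessarily what the answer had in mind): merging on both pais and ano, rather than ano alone, is an assumption here, as are the suffix names. An outer merge keeps a cantidad column from each side, so the two still have to be combined afterwards, preferring the df1 value where one exists.

```python
import pandas as pd

# Sample data copied from the question.
df1 = pd.DataFrame({"pais": ["Chile"] * 3,
                    "ano": [2000, 2001, 2002],
                    "cantidad": [10, 11, 12]})
df2 = pd.DataFrame({"pais": ["Chile"] * 5,
                    "ano": [1999, 2000, 2001, 2002, 2003],
                    "cantidad": [0, 0, 0, 0, 0]})

# Outer merge keeps every (pais, ano) pair found in either dataframe.
# The suffixes are illustrative names for the two cantidad columns.
merged = df2.merge(df1, on=["pais", "ano"], how="outer",
                   suffixes=("_df2", "_df1"))

# Prefer the value from df1; fall back to df2 where df1 has no row.
merged["cantidad"] = (merged["cantidad_df1"]
                      .fillna(merged["cantidad_df2"])
                      .astype(int))

result = (merged[["pais", "ano", "cantidad"]]
          .sort_values(["pais", "ano"])
          .reset_index(drop=True))
print(result)
```

Sorting explicitly avoids relying on the row order that merge happens to return.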

You can perform a left join on df2 and use fillna to fill the missing values from df2.cantidad. I'm joining on pais and ano because I assume your real dataframe contains more countries than 'Chile'.

df = df2[['pais','ano']].merge(df1, on=['pais','ano'], how='left').fillna({'cantidad': df2.cantidad})
df.cantidad = df.cantidad.astype('int')
df

Out:

    pais   ano  cantidad
0  Chile  1999         0
1  Chile  2000        10
2  Chile  2001        11
3  Chile  2002        12
4  Chile  2003         0
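
For comparison, the combine_first attempt from the question can probably be made to work as well. combine_first aligns the two dataframes on their index, so with the default 0..n row index it pairs rows by position rather than by year. A minimal sketch, assuming pais and ano together identify a row uniquely, is to move those columns into the index first:

```python
import pandas as pd

# Sample data copied from the question.
df1 = pd.DataFrame({"pais": ["Chile"] * 3,
                    "ano": [2000, 2001, 2002],
                    "cantidad": [10, 11, 12]})
df2 = pd.DataFrame({"pais": ["Chile"] * 5,
                    "ano": [1999, 2000, 2001, 2002, 2003],
                    "cantidad": [0, 0, 0, 0, 0]})

keys = ["pais", "ano"]  # assumed to identify each row uniquely

# With the keys as the index, combine_first matches rows by (pais, ano):
# values from df1 win, and df2 fills in the keys that df1 lacks.
result = (df1.set_index(keys)
             .combine_first(df2.set_index(keys))
             .sort_index()
             .reset_index())

# Alignment may introduce NaN and upcast to float, so cast back to int.
result["cantidad"] = result["cantidad"].astype(int)
print(result)
```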
