
Python Pandas: Append Dataframe To Another Dataframe Only If Column Value is Unique

I have two data frames that I want to append together. Below are samples.

df_1:

Code    Title
103     general checks 
107     limits
421     horseshoe
319     scheduled 
501     zonal 

df_2:

Code    Title
103     hello 
108     lucky eight 
421     little toe 
319     scheduled cat
503     new item 

I want to append df_2 to df_1 ONLY IF the code number in df_2 does not already exist in df_1.

Below is the dataframe I want:

Code    Title
103     general checks 
107     limits
421     horseshoe
319     scheduled 
501     zonal 
108     lucky eight 
503     new item

I have searched through Google and Stack Overflow but couldn't find anything on this specific case.

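Just append the filtered data frame. DataFrame.append was removed in pandas 2.0, so pd.concat is used here:

import pandas as pd
# Keep only the df_2 rows whose Code does not appear in df_1, then append them.
df3 = df_2.loc[~df_2.Code.isin(df_1.Code)]
pd.concat([df_1, df3])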

    Code    Title
0   103 general checks
1   107 limits
2   421 horseshoe
3   319 scheduled
4   501 zonal
1   108 lucky eight
4   503 new item

Notice that you might end up with duplicated index values, which may cause problems. To avoid that, call .reset_index(drop=True) to get a fresh df with no duplicated index values.

pd.concat([df_1, df3]).reset_index(drop=True)

    Code    Title
0   103 general checks
1   107 limits
2   421 horseshoe
3   319 scheduled
4   501 zonal
5   108 lucky eight
6   503 new item
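
For reference, here is a self-contained sketch of the filter-then-append approach above, written with pd.concat so it runs on pandas versions that no longer have DataFrame.append. The frames are rebuilt from the question's samples, and the helper name append_new_codes is made up for illustration; it is not part of pandas.

```python
import pandas as pd

# Sample frames rebuilt from the question.
df_1 = pd.DataFrame({
    "Code": [103, 107, 421, 319, 501],
    "Title": ["general checks", "limits", "horseshoe", "scheduled", "zonal"],
})
df_2 = pd.DataFrame({
    "Code": [103, 108, 421, 319, 503],
    "Title": ["hello", "lucky eight", "little toe", "scheduled cat", "new item"],
})


def append_new_codes(base, extra, key="Code"):
    """Hypothetical helper: append the rows of `extra` whose key is absent from `base`."""
    new_rows = extra.loc[~extra[key].isin(base[key])]
    # ignore_index=True renumbers the rows, so no duplicated index values remain.
    return pd.concat([base, new_rows], ignore_index=True)


result = append_new_codes(df_1, df_2)
print(result)
```

Passing ignore_index=True to pd.concat has the same effect here as calling .reset_index(drop=True) afterwards.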

You can concat and then drop_duplicates. This assumes that Code is unique within each dataframe.

res = pd.concat([df1, df2]).drop_duplicates('Code')

print(res)

   Code           Title
0   103  general_checks
1   107          limits
2   421       horseshoe
3   319       scheduled
4   501           zonal
1   108     lucky_eight
4   503        new_item
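
To see why the uniqueness assumption matters, the sketch below compares drop_duplicates with the isin filter on a small made-up df_2 that repeats a Code. The data is invented for illustration only and is not the question's sample.

```python
import pandas as pd

df_1 = pd.DataFrame({"Code": [103, 107], "Title": ["general checks", "limits"]})
# Made-up data: df_2 deliberately repeats Code 108 to break the uniqueness assumption.
df_2 = pd.DataFrame({
    "Code": [103, 108, 108],
    "Title": ["hello", "lucky eight", "lucky eight again"],
})

# keep="first" (the default) retains df_1's row when a Code appears in both frames,
# but it also collapses the repeated Code inside df_2.
deduped = pd.concat([df_1, df_2], ignore_index=True).drop_duplicates("Code", keep="first")

# The isin filter only drops df_2 rows whose Code already exists in df_1.
filtered = pd.concat([df_1, df_2[~df_2["Code"].isin(df_1["Code"])]], ignore_index=True)

print(deduped["Title"].tolist())
print(filtered["Title"].tolist())
```

With this data, deduped keeps one row for Code 108 while filtered keeps both, so the two approaches only agree when Code is unique within each frame.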

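Similar to concat(), you could also use merge. This relies on the df_1 rows coming before the df_2 rows in the merged result, so that drop_duplicates keeps the df_1 version of each Code: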

df3 = pd.merge(df_1, df_2, how='outer').drop_duplicates('Code')

    Code    Title
0   103 general checks
1   107 limits
2   421 horseshoe
3   319 scheduled
4   501 zonal
6   108 lucky eight
9   503 new item  
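
If you would rather not depend on the row order of a merged result, one alternative (not from the answers above, just a sketch) is an anti-join: left-merge df_2 against df_1's Code column with indicator=True, keep the rows marked left_only, and concatenate those onto df_1. The frames are rebuilt from the question's samples.

```python
import pandas as pd

df_1 = pd.DataFrame({
    "Code": [103, 107, 421, 319, 501],
    "Title": ["general checks", "limits", "horseshoe", "scheduled", "zonal"],
})
df_2 = pd.DataFrame({
    "Code": [103, 108, 421, 319, 503],
    "Title": ["hello", "lucky eight", "little toe", "scheduled cat", "new item"],
})

# Left-join df_2 against df_1's Code values only; the `_merge` column
# records whether each df_2 row found a match in df_1.
flagged = df_2.merge(
    df_1[["Code"]].drop_duplicates(), on="Code", how="left", indicator=True
)

# "left_only" marks rows whose Code exists in df_2 but not in df_1.
new_rows = flagged.loc[flagged["_merge"] == "left_only", ["Code", "Title"]]

result = pd.concat([df_1, new_rows], ignore_index=True)
print(result)
```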
