Join two dataframes on a unique ID, but use another value if the ID does not exist

Question

I have two dataframes like this:

UID    mainColumn .... (other columns of data)
1      apple
2      orange
3      apple
4      orange
5      berry
....

UID2   mainColumn2
1      truck
3      car
4      boat
5      plane
...

I need to join the second dataframe onto the first based on UID. However, if df2 does not contain a UID, then the mainColumn value from df1 is the one I'd like to use. In the above example, UID2 does not contain the value 2, so the final table would look something like this:

UID    mainColumn ....
1      truck
2      orange
3      car
4      boat
5      plane
...

I'm aware we can do something like:

df1=df1.merge(df2,left_on='UID', right_on='UID2')

But the issue is that rows whose UID is missing from df2 should keep their original value and still be included in the result. Thanks!
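To make the problem concrete, here is a minimal sketch that assumes the two frames hold only the sample rows shown in the question (the "other columns" are left out). Because merge defaults to how='inner', UID 2, which has no match in df2, is dropped:

```python
import pandas as pd

# Sample data reconstructed from the tables in the question
df1 = pd.DataFrame({'UID': [1, 2, 3, 4, 5],
                    'mainColumn': ['apple', 'orange', 'apple', 'orange', 'berry']})
df2 = pd.DataFrame({'UID2': [1, 3, 4, 5],
                    'mainColumn2': ['truck', 'car', 'boat', 'plane']})

# merge defaults to how='inner', so rows without a match in df2 are dropped
inner = df1.merge(df2, left_on='UID', right_on='UID2')
print(inner['UID'].tolist())  # [1, 3, 4, 5] -- UID 2 is missing
```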

Answer 1

You can use combine_first() after renaming the columns of df2 to match df1 (e.g. UID2 to UID):

# Rename only the matching columns so that df2 lines up with df1
df2 = df2.rename(columns={'UID2': 'UID', 'mainColumn2': 'mainColumn'})
final_df = df2.set_index('UID').combine_first(df1.set_index('UID')).reset_index()

  UID mainColumn
0    1      truck
1    2     orange
2    3        car
3    4       boat
4    5      plane
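A self-contained sketch of this approach, assuming the sample rows from the question plus a hypothetical extra column qty standing in for the "other columns of data". Columns that exist only in df1 are carried through unchanged, since combine_first only fills from df2 where df2 has a value:

```python
import pandas as pd

# Sample data from the question; 'qty' is a hypothetical extra column
df1 = pd.DataFrame({'UID': [1, 2, 3, 4, 5],
                    'mainColumn': ['apple', 'orange', 'apple', 'orange', 'berry'],
                    'qty': [10, 20, 30, 40, 50]})
df2 = pd.DataFrame({'UID2': [1, 3, 4, 5],
                    'mainColumn2': ['truck', 'car', 'boat', 'plane']})

# Rename explicitly so only the intended columns are aligned with df1
df2_aligned = df2.rename(columns={'UID2': 'UID', 'mainColumn2': 'mainColumn'})

# Values from df2 take priority; gaps (UID 2, and all of 'qty') come from df1
final_df = (df2_aligned.set_index('UID')
            .combine_first(df1.set_index('UID'))
            .reset_index())
print(final_df)
```

The explicit rename avoids assigning df1.columns to df2 wholesale, which would fail as soon as df1 has more columns than df2.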

Answer 2

We can first use merge, then fillna the missing values, and finally drop the extra column:

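# Left join keeps every row of df1; unmatched UIDs get NaN in mainColumn2
final = df1.merge(df2, left_on='UID', right_on='UID2', how='left').drop('UID2', axis=1)

# Prefer the value from df2 and fall back to df1 where it is missing
final['mainColumn'] = final['mainColumn2'].fillna(final['mainColumn'])

final.drop('mainColumn2', axis=1, inplace=True)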

   UID mainColumn
0    1      truck
1    2     orange
2    3        car
3    4       boat
4    5      plane
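As a possible variation that skips the merge and the extra columns, the same fallback can be sketched with Series.map. This is not from the answer above; it assumes the sample rows from the question and that UID2 is unique in df2, as the question title implies:

```python
import pandas as pd

# Sample data reconstructed from the tables in the question
df1 = pd.DataFrame({'UID': [1, 2, 3, 4, 5],
                    'mainColumn': ['apple', 'orange', 'apple', 'orange', 'berry']})
df2 = pd.DataFrame({'UID2': [1, 3, 4, 5],
                    'mainColumn2': ['truck', 'car', 'boat', 'plane']})

# Build a UID -> replacement lookup (requires UID2 to be unique)
lookup = df2.set_index('UID2')['mainColumn2']

# map yields NaN for UIDs absent from df2; fillna restores the original value
final = df1.copy()
final['mainColumn'] = final['UID'].map(lookup).fillna(final['mainColumn'])
print(final)
```

Working on a copy leaves df1 untouched, and no temporary UID2 or mainColumn2 columns need to be dropped afterwards.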

2 answers

Answer 1
1 2019-06-10 18:18:05

Answer 2
0 2019-06-10 18:50:11
