How to join pandas dataframes on two keys with a prioritized key?
How can I left join two pandas dataframes (df1, df2) on two keys (bla1, bla2), where the bla2 key should be used whenever it's not null (see the last two rows in df1)?
Pseudo-code
if bla2 is not null then join bla on bla2
else join bla on bla1
Dataframes
df1
| bla1 | bla2 | a | b |
|------|------|-----|-----|
| 1 | | ... | ... |
| 2 | | ... | ... |
| 3 | | ... | ... |
| 4 | 7 | ... | ... |
| 5 | 8 | ... | ... |
+ df2
| bla | x | y | z |
|-----|-----|-----|-----|
| 1 | ... | ... | ... |
| 2 | ... | ... | ... |
| 3 | ... | ... | ... |
| 7 | ... | ... | ... |
| 8 | ... | ... | ... |
= df3
| bla1 | bla2 | a | b | x | y | z |
|------|------|-----|-----|-----|-----|-----|
| 1 | | ... | ... | ... | ... | ... |
| 2 | | ... | ... | ... | ... | ... |
| 3 | | ... | ... | ... | ... | ... |
| 4 | 7 | ... | ... | ... | ... | ... |
| 5 | 8 | ... | ... | ... | ... | ... |
First create a new column to combine both columns.
df1["new_column"] = df1.bla2.fillna(df1.bla1)
Then join both frames and drop the extra columns that were created.
df3 = pd.merge(df1, df2, how="left", left_on="new_column", right_on="bla").drop(["new_column", "bla"], axis=1)
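For reference, here is a minimal end-to-end sketch of the same idea, assuming a left join as the question asks for. The sample values in the a and x columns are made up, since the tables above elide them, and the helper column name join_key is only an illustrative choice. It builds the key on a copy via assign so that df1 is not modified.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: the "a" and "x" values are placeholders,
# because the original tables only show "..." for them.
df1 = pd.DataFrame({
    "bla1": [1, 2, 3, 4, 5],
    "bla2": [np.nan, np.nan, np.nan, 7, 8],
    "a": ["a1", "a2", "a3", "a4", "a5"],
})
df2 = pd.DataFrame({
    "bla": [1, 2, 3, 7, 8],
    "x": ["x1", "x2", "x3", "x7", "x8"],
})

# Build the join key on a copy so df1 itself stays unchanged.
# bla2 takes priority; bla1 fills in wherever bla2 is null.
keyed = df1.assign(join_key=df1["bla2"].fillna(df1["bla1"]))

# A left join keeps every row of df1; the helper columns are dropped afterwards.
df3 = keyed.merge(df2, how="left", left_on="join_key", right_on="bla").drop(
    columns=["join_key", "bla"]
)
print(df3)
```

With this data, the rows with bla1 equal to 4 and 5 are matched through bla2 (7 and 8), while the first three rows fall back to bla1. Note that bla2 becomes a float column because it contains NaN.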