pandas merge_asof with more than one match
I would like to join the following dataframes with pandas merge_asof:
ll = pd.DataFrame([[pd.to_datetime('2010-01-01')], [pd.to_datetime('2010-02-01')]], columns = ['date_left'])
rr = pd.DataFrame([[pd.to_datetime('2010-01-01'), 12],
[pd.to_datetime('2010-01-01'), 6]], columns = ['date_right', 'variable'])
That is, ll:
date_left
0 2010-01-01
1 2010-02-01
and rr:
date_right variable
0 2010-01-01 12
1 2010-01-01 6
The following
pd.merge_asof(ll, rr, left_on = 'date_left', right_on='date_right', direction='backward')
gets me
date_left date_right variable
0 2010-01-01 2010-01-01 6
1 2010-02-01 2010-01-01 6
but I would like (and expected, since it is a left join) every matching row from rr:
date_left date_right variable
0 2010-01-01 2010-01-01 6
1 2010-01-01 2010-01-01 12
2 2010-02-01 2010-01-01 6
3 2010-02-01 2010-01-01 12
How can I achieve this result?
---- EDIT ----: Sammywemmy suggested using pyjanitor's conditional_join. This works for the minimal example I posted above. However, I still want the rest of the merge_asof behaviour: each left row should match only the most recent date_right, but return every right row that shares that date. By that I mean the following:
ll = pd.DataFrame([[pd.to_datetime('2010-01-01')], [pd.to_datetime('2010-02-01')],[pd.to_datetime('2010-03-01')], [pd.to_datetime('2010-04-01')]], columns = ['date_left'])
ll =
date_left
0 2010-01-01
1 2010-02-01
2 2010-03-01
3 2010-04-01
and
rr = pd.DataFrame([[pd.to_datetime('2010-01-01'), 12],
[pd.to_datetime('2010-01-01'), 6],
[pd.to_datetime('2010-03-01'), 3]], columns = ['date_right', 'variable'])
rr =
date_right variable
0 2010-01-01 12
1 2010-01-01 6
2 2010-03-01 3
Then I would like:
date_left date_right variable
0 2010-01-01 2010-01-01 6
1 2010-01-01 2010-01-01 12
2 2010-02-01 2010-01-01 6
3 2010-02-01 2010-01-01 12
4 2010-03-01 2010-03-01 3
5 2010-04-01 2010-03-01 3
Whereas the conditional join would give me every earlier row, not just the rows for the most recent date:
date_left date_right variable
0 2010-01-01 2010-01-01 12
1 2010-01-01 2010-01-01 6
2 2010-02-01 2010-01-01 12
3 2010-02-01 2010-01-01 6
4 2010-03-01 2010-01-01 12
5 2010-03-01 2010-01-01 6
6 2010-03-01 2010-03-01 3
7 2010-04-01 2010-01-01 12
8 2010-04-01 2010-01-01 6
9 2010-04-01 2010-03-01 3
Thanks.
One option is conditional_join from pyjanitor:
# pip install pyjanitor
import pandas as pd
import janitor
ll.conditional_join(rr,
# column from left, column from right, operator
('date_left', 'date_right', '>='),
how = 'left')
date_left date_right variable
0 2010-01-01 2010-01-01 12
1 2010-01-01 2010-01-01 6
2 2010-02-01 2010-01-01 12
3 2010-02-01 2010-01-01 6
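For the extended example in the edit, a possible alternative is to do the as-of match against the unique right-hand dates first, and then expand each matched date back to all rows of rr with an ordinary merge. This is only a minimal sketch under the assumption that both date columns have the same datetime dtype; the names keys, matched and result are made up for illustration and are not part of the original question or answer.

```python
import pandas as pd

ll = pd.DataFrame({
    "date_left": pd.to_datetime(
        ["2010-01-01", "2010-02-01", "2010-03-01", "2010-04-01"]
    )
})
rr = pd.DataFrame({
    "date_right": pd.to_datetime(["2010-01-01", "2010-01-01", "2010-03-01"]),
    "variable": [12, 6, 3],
})

# Step 1: as-of match each left date to the most recent unique right date.
# merge_asof needs both sides sorted on their keys.
keys = rr[["date_right"]].drop_duplicates().sort_values("date_right")
matched = pd.merge_asof(
    ll.sort_values("date_left"),
    keys,
    left_on="date_left",
    right_on="date_right",
    direction="backward",
)

# Step 2: expand each matched date to every row of rr that shares it.
result = matched.merge(rr, on="date_right", how="left")
print(result)
```

On the data above this yields six rows: two for each of 2010-01-01 and 2010-02-01 (variable 12 and 6), and one each for 2010-03-01 and 2010-04-01 (variable 3). Left rows that are earlier than every date_right keep a missing date_right and variable, as with a plain merge_asof.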