
pandas merge_asof with more than one match

I want to join the following DataFrames with pandas merge_asof:

import pandas as pd

ll = pd.DataFrame([[pd.to_datetime('2010-01-01')], [pd.to_datetime('2010-02-01')]], columns = ['date_left'])
rr = pd.DataFrame([[pd.to_datetime('2010-01-01'), 12],
                   [pd.to_datetime('2010-01-01'), 6]], columns = ['date_right', 'variable'])

That is, ll:

    date_left
0   2010-01-01
1   2010-02-01

and rr:

    date_right  variable
0   2010-01-01  12
1   2010-01-01  6

The following call

pd.merge_asof(ll, rr, left_on = 'date_left', right_on='date_right', direction='backward')

gives me

    date_left   date_right  variable
0   2010-01-01  2010-01-01  6
1   2010-02-01  2010-01-01  6

But I want (and expected, since it is a left join)

    date_left   date_right  variable
0   2010-01-01  2010-01-01  6
1   2010-01-01  2010-01-01  12
2   2010-02-01  2010-01-01  6
3   2010-02-01  2010-01-01  12

How can I achieve this result?
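For context, the behaviour described above can be reproduced with a short check. This is only a minimal sketch using the same data as the question (the name `out` is illustrative): merge_asof keeps every left row, like a left join, but attaches at most one right row to each, so ties on date_right are not expanded.

```python
import pandas as pd

ll = pd.DataFrame({"date_left": pd.to_datetime(["2010-01-01", "2010-02-01"])})
rr = pd.DataFrame({
    "date_right": pd.to_datetime(["2010-01-01", "2010-01-01"]),
    "variable": [12, 6],
})

out = pd.merge_asof(ll, rr, left_on="date_left", right_on="date_right",
                    direction="backward")

# merge_asof returns one row per left row, even when rr has tied keys.
print(len(out) == len(ll))
print(out)
```

The printed frame should correspond to the two-row result shown in the question, which is why the tied row with variable 12 never appears.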

---- EDIT ----: Sammywemmy gave a solution using janitor's conditional_join. This works for the minimal example I posted above. However, I still want the rest of the merge_asof functionality. What I mean is:

ll = pd.DataFrame([[pd.to_datetime('2010-01-01')], [pd.to_datetime('2010-02-01')],[pd.to_datetime('2010-03-01')], [pd.to_datetime('2010-04-01')]], columns = ['date_left'])

ll =

    date_left
0   2010-01-01
1   2010-02-01
2   2010-03-01
3   2010-04-01

rr = pd.DataFrame([[pd.to_datetime('2010-01-01'), 12],
                   [pd.to_datetime('2010-01-01'), 6],
                   [pd.to_datetime('2010-03-01'), 3]], columns = ['date_right', 'variable'])

rr =

    date_right  variable
0   2010-01-01  12
1   2010-01-01  6
2   2010-03-01  3

Then I want:

    date_left   date_right  variable
0   2010-01-01  2010-01-01  6
1   2010-01-01  2010-01-01  12
2   2010-02-01  2010-01-01  6
3   2010-02-01  2010-01-01  12
4   2010-03-01  2010-03-01  3
5   2010-04-01  2010-03-01  3

whereas the conditional join would give me:

    date_left   date_right  variable
0   2010-01-01  2010-01-01  12
1   2010-01-01  2010-01-01  6
2   2010-02-01  2010-01-01  12
3   2010-02-01  2010-01-01  6
4   2010-03-01  2010-01-01  12
5   2010-03-01  2010-01-01  6
6   2010-03-01  2010-03-01  3
7   2010-04-01  2010-01-01  12
8   2010-04-01  2010-01-01  6
9   2010-04-01  2010-03-01  3

Thanks
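One way to keep the as-of matching while still returning every tied row might be to split the work into two steps: first use merge_asof against the distinct right-hand dates only, then expand the ties with an ordinary merge on the matched date. The following is a sketch under that assumption, not a built-in merge_asof option; the names `right_dates`, `matched` and `result` are illustrative.

```python
import pandas as pd

ll = pd.DataFrame({"date_left": pd.to_datetime(
    ["2010-01-01", "2010-02-01", "2010-03-01", "2010-04-01"])})
rr = pd.DataFrame({
    "date_right": pd.to_datetime(["2010-01-01", "2010-01-01", "2010-03-01"]),
    "variable": [12, 6, 3],
})

# Step 1: as-of match each left date to the most recent distinct right date.
# merge_asof requires both key columns to be sorted.
right_dates = rr[["date_right"]].drop_duplicates().sort_values("date_right")
matched = pd.merge_asof(ll.sort_values("date_left"), right_dates,
                        left_on="date_left", right_on="date_right",
                        direction="backward")

# Step 2: an ordinary left merge on the matched date brings back all tied rows.
result = (matched.merge(rr, on="date_right", how="left")
                 .sort_values(["date_left", "variable"])
                 .reset_index(drop=True))
print(result)
```

Because step 1 is still a merge_asof call, other options such as `tolerance` or `direction` can be passed there, assuming they make sense for the distinct dates.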

One option is to use pyjanitor's conditional_join:

# pip install pyjanitor
import pandas as pd
import janitor
ll.conditional_join(rr,
                    # column from left, column from right, operator
                    ('date_left', 'date_right', '>='),
                    how = 'left')

   date_left date_right  variable
0 2010-01-01 2010-01-01        12
1 2010-01-01 2010-01-01         6
2 2010-02-01 2010-01-01        12
3 2010-02-01 2010-01-01         6
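If the right frame contains several distinct dates, a `>=` join returns every earlier right row, not just the most recent ones. A possible follow-up is to keep, for each date_left, only the rows whose date_right is the latest one matched. The sketch below uses a plain pandas cross join plus a filter as a stand-in for the conditional join (an assumption made to keep the example free of extra dependencies); `joined`, `latest` and `result` are illustrative names.

```python
import pandas as pd

ll = pd.DataFrame({"date_left": pd.to_datetime(
    ["2010-01-01", "2010-02-01", "2010-03-01", "2010-04-01"])})
rr = pd.DataFrame({
    "date_right": pd.to_datetime(["2010-01-01", "2010-01-01", "2010-03-01"]),
    "variable": [12, 6, 3],
})

# Stand-in for the '>=' conditional join: cross join, then filter.
joined = ll.merge(rr, how="cross")
joined = joined[joined["date_left"] >= joined["date_right"]]

# For each left date, keep only rows carrying the latest matched right date.
latest = joined.groupby("date_left")["date_right"].transform("max")
result = (joined[joined["date_right"] == latest]
          .sort_values(["date_left", "variable"])
          .reset_index(drop=True))
print(result)
```

Note that this filter behaves like an inner join: a left date with no earlier right date would be dropped rather than kept with missing values, and the cross join can be expensive on large frames.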
