Is there a Pandas/numpy equivalent of group_by lead/lag with ifelse statements in dplyr?

Question

I'm analyzing contracts in Python to see which ones were paid exactly on time and which were not. A contract counts as paid on time when its outstanding_balance reaches 0 on the contract's maturity date.

The pandas DataFrame I'm using is:

import pandas as pd

example_data = {'contract_no': [1,1,2,2,3,3],
                'maturity_date': ['2019-01-02', '2019-01-02', '2019-01-02', '2019-01-02', '2019-01-02', '2019-01-02'],
                'date_report_created': ['2019-01-01', '2019-01-02', '2019-01-01', '2019-01-02', '2019-01-01', '2019-01-02'],
                'outstanding_balance': [10, 0, 20, 20, 0, 0]}
example_data = pd.DataFrame(example_data, columns = ['contract_no',
                                                     'maturity_date',
                                                     'date_report_created',
                                                     'outstanding_balance'])

The dataframe is shown below. As you can see, for contract_no == 1 the outstanding balance is paid on time: it reaches 0 on the row where maturity_date == date_report_created. The 2nd contract is paid late, and the 3rd one is paid early.

Essentially, I'm looking for contracts whose outstanding_balance == 0 for the first time on the row where maturity_date == date_report_created.

   contract_no maturity_date date_report_created  outstanding_balance
0            1    2019-01-02          2019-01-01                   10
1            1    2019-01-02          2019-01-02                    0
2            2    2019-01-02          2019-01-01                   20
3            2    2019-01-02          2019-01-02                   20
4            3    2019-01-02          2019-01-01                    0
5            3    2019-01-02          2019-01-02                    0

And this is what I'd like the output to be:

   contract_no maturity_date date_report_created  outstanding_balance  paid_on_time
0            1    2019-01-02          2019-01-01                   10             1
1            1    2019-01-02          2019-01-02                    0             1
2            2    2019-01-02          2019-01-01                   20             0
3            2    2019-01-02          2019-01-02                   20             0
4            3    2019-01-02          2019-01-01                    0             0
5            3    2019-01-02          2019-01-02                    0             0

I've tried to achieve this with pandas / numpy in Python 3. I'd be really grateful if anyone knows how to do this. I know it will require a groupby() on contract_no and some ifelse() lag/lead logic somewhere!

Answer 1

Using transform + idxmax

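cno = example_data['contract_no']
ob = example_data['outstanding_balance']
md = example_data['maturity_date']
drc = example_data['date_report_created']

# Per contract: index label of the first row where the balance is 0
i = ob.eq(0).groupby(cno).transform('idxmax')
# Per contract: index label of the first row where the report date is the maturity date
j = md.eq(drc).groupby(cno).transform('idxmax')

# 1 where both labels coincide; astype replaces Series.view, which is deprecated in recent pandas
i.eq(j).astype('i1')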

0    1
1    1
2    0
3    0
4    0
5    0
dtype: int8
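
For readers coming from dplyr, the lag/ifelse logic the question asks about can also be written directly. The following is a minimal sketch of a different technique, not the method above: groupby().shift() plays the role of lag() and np.where() the role of ifelse(). The names df, prev_balance, first_zero and on_time_row are illustrative only, and the sketch assumes that a balance never rises again after reaching 0 and that the dates are ISO-formatted strings, so they sort chronologically.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'contract_no': [1, 1, 2, 2, 3, 3],
    'maturity_date': ['2019-01-02'] * 6,
    'date_report_created': ['2019-01-01', '2019-01-02'] * 3,
    'outstanding_balance': [10, 0, 20, 20, 0, 0],
})

# lag() only makes sense on ordered rows: sort reports within each contract.
df = df.sort_values(['contract_no', 'date_report_created'])

# Equivalent of dplyr's lag(): the previous report's balance within the
# same contract (NaN on each contract's first row).
prev_balance = df.groupby('contract_no')['outstanding_balance'].shift(1)

# A row is the first zero if the balance is 0 and the previous report was
# non-zero or missing (NaN != 0 evaluates to True).
first_zero = df['outstanding_balance'].eq(0) & prev_balance.ne(0)

# The first zero must fall on the maturity date.
on_time_row = first_zero & df['maturity_date'].eq(df['date_report_created'])

# Equivalent of ifelse(): broadcast the flag to every row of the contract.
df['paid_on_time'] = np.where(
    on_time_row.groupby(df['contract_no']).transform('any'), 1, 0
)

print(df)
```

One difference worth noting: idxmax on an all-False group returns that group's first label, so with the idxmax approach a contract that never reaches 0 and has no report on its maturity date would be flagged 1. The sketch here flags such a contract 0, because it needs at least one row that satisfies both conditions.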


Answer 1: 2, accepted 2019-07-23 12:42:52
