How to get the average of consecutive occurrences when a certain condition is met in R
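I have a data frame containing customer payment schedules, ordered by transaction date. I want to compute the average number of consecutive failed payments and the average number of consecutive successful payments. The table looks like this: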
customer_id | transaction_id. | failed_or_success | transaction_date
1           | 1               | success           | 2021-01-01
1           | 2               | success           | 2021-01-15
1           | 3               | failed            | 2021-01-30
1           | 4               | success           | 2021-02-15
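For example, the average number of consecutive successful payments is (2+1)/2 = 1.5: the 2 comes from transaction_id 1 & 2, and the 1 comes from transaction_id 4. The average number of consecutive failed payments is 1 in this example. The final table would look like this: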
cus_id | tran_id. | f_or_s  | tran_date  | avg_consec_fail | avg_consec_success
1      | 1        | success | 2021-01-01 | 1               | 1.5
1      | 2        | success | 2021-01-15 | 1               | 1.5
1      | 3        | failed  | 2021-01-30 | 1               | 1.5
1      | 4        | success | 2021-02-15 | 1               | 1.5
How can I achieve this using R/dplyr?
You can try using rle:
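library(dplyr)

df <- read.table(text = "customer_id transaction_id. failed_or_success transaction_date
1 1 success 2021-01-01
1 2 success 2021-01-15
1 3 failed 2021-01-30
1 4 success 2021-02-15", header = TRUE, stringsAsFactors = FALSE)

# rle() splits the status column into runs of identical values;
# average the lengths of the runs that are not "failed"
df %>%
  mutate(avg_consec_success = mean(rle(failed_or_success)$lengths[rle(failed_or_success)$values != "failed"]))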
  customer_id transaction_id. failed_or_success transaction_date avg_consec_success
1           1               1           success       2021-01-01                1.5
2           1               2           success       2021-01-15                1.5
3           1               3            failed       2021-01-30                1.5
4           1               4           success       2021-02-15                1.5
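The snippet above only produces avg_consec_success and treats the whole data frame as one customer. A minimal sketch of how the same rle idea might be extended to both columns and to several customers follows. The helper name avg_run_length is hypothetical (it is not part of dplyr or of the original answer), and customer 2 is made-up data added only to show that the grouping works.

```r
library(dplyr)

# Hypothetical helper: mean length of the runs of `value` in `x`.
# Returns NA when `x` contains no run of `value`.
avg_run_length <- function(x, value) {
  runs <- rle(as.character(x))
  lens <- runs$lengths[runs$values == value]
  if (length(lens) == 0) NA_real_ else mean(lens)
}

# Customer 1 is the data from the question; customer 2 is invented for illustration.
df <- data.frame(
  customer_id = c(1, 1, 1, 1, 2, 2, 2),
  transaction_id = 1:7,
  failed_or_success = c("success", "success", "failed", "success",
                        "failed", "failed", "success"),
  transaction_date = as.Date(c("2021-01-01", "2021-01-15", "2021-01-30",
                               "2021-02-15", "2021-01-05", "2021-01-20",
                               "2021-02-04"))
)

result <- df %>%
  arrange(customer_id, transaction_date) %>%  # runs only make sense in date order
  group_by(customer_id) %>%                   # compute the runs per customer
  mutate(
    avg_consec_fail = avg_run_length(failed_or_success, "failed"),
    avg_consec_success = avg_run_length(failed_or_success, "success")
  ) %>%
  ungroup()

print(result)
```

For customer 1 this reproduces the values from the question (avg_consec_fail = 1, avg_consec_success = 1.5). Returning NA for a customer with no failed (or no successful) payments is a design choice of this sketch; 0 may be preferable depending on how the column is used later.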