How to cumulatively sum a variable by unique values and write the result back in
For each transaction, I want to sum the indicator values, keep that total on the first day where the indicator is 1, and remove the rows for the days after it. The original table:
transaction | day | indicator
---|---|---
1 | 1 | 0
1 | 2 | 0
1 | 3 | 0
1 | 4 | 1
1 | 5 | 1
1 | 6 | 1
2 | 1 | 0
2 | 2 | 0
2 | 3 | 0
2 | 4 | 0
2 | 5 | 1
2 | 6 | 1
and produce a new table like this:
transaction | day | indicator
---|---|---
1 | 1 | 0
1 | 2 | 0
1 | 3 | 0
1 | 4 | 3
2 | 1 | 0
2 | 2 | 0
2 | 3 | 0
2 | 4 | 0
2 | 5 | 2
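To run the answers below, the input table first has to exist as a data frame named `df`. The following is a minimal sketch of one way to build it; the integer column types are an assumption, since the question does not show how the data were created.

```r
# Assumed reconstruction of the question's input table (integer columns are a guess).
df <- data.frame(
  transaction = rep(1:2, each = 6),
  day         = rep(1:6, times = 2),
  indicator   = c(0L, 0L, 0L, 1L, 1L, 1L,
                  0L, 0L, 0L, 0L, 1L, 1L)
)

# Per-transaction totals that the desired output should carry:
# 3 for transaction 1 and 2 for transaction 2.
totals <- tapply(df$indicator, df$transaction, sum)
```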
Within each transaction, change `day` on every row with indicator == 1 to the first day with indicator == 1, then sum the indicator per transaction and day:
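library(dplyr)

df %>%
  group_by(transaction) %>%
  # rows with indicator == 1 all take the first such day of their transaction;
  # first() yields NA (rather than a zero-length value) if a transaction has none
  mutate(day = case_when(indicator == 0 ~ day,
                         TRUE ~ first(day[indicator == 1]))) %>%
  group_by(transaction, day) %>%
  # collapse the merged rows by summing their indicators
  summarise(indicator = sum(indicator)) %>%
  ungroup()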
transaction day indicator
<int> <int> <int>
1 1 1 0
2 1 2 0
3 1 3 0
4 1 4 3
5 2 1 0
6 2 2 0
7 2 3 0
8 2 4 0
9 2 5 2
Please try the code below (`df1` and `df2` are the answerer's input data frames, which are not shown here):
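library(dplyr); library(tidyr)  # tidyr provides fill()
df <- bind_rows(df1, df2) %>% group_by(transaction) %>%
  # cumsum2 holds the day on which the running total first reaches 1
  mutate(cumsum = cumsum(indicator), cumsum2 = ifelse(cumsum == 1, day, NA)) %>%
  fill(cumsum2) %>%  # carry that day down to the later rows
  mutate(day = ifelse(!is.na(cumsum2), cumsum2, day)) %>%
  # the last row per transaction and day carries the final running total
  group_by(transaction, day) %>% slice_tail(n = 1) %>% select(-cumsum2)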
Created on 2023-01-19 with reprex v2.0.2
# A tibble: 8 × 4
# Groups: transaction, day [8]
transaction day indicator cumsum
<dbl> <int> <dbl> <dbl>
1 1 1 0 0
2 1 2 0 0
3 1 3 0 0
4 1 4 1 3
5 2 1 0 0
6 2 2 0 0
7 2 3 0 0
8 2 4 1 2
Another approach to try. After grouping by `transaction`, change `indicator` to either 0 (unchanged) or the sum of `indicator` for that transaction. Finally, `filter` to keep rows for as long as `cumall` (cumulative all) reports that every preceding `indicator` value is 0. Because `lag` shifts the test back by one row, the last row kept is the first one containing the sum.
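library(tidyverse)

df %>%
  group_by(transaction) %>%
  # zeros stay 0; every 1 is replaced by the transaction's total
  mutate(indicator = ifelse(indicator == 0, 0, sum(indicator))) %>%
  # keep rows while all preceding rows have indicator 0
  filter(cumall(lag(indicator, default = 0) == 0))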
Output
transaction day indicator
<int> <int> <dbl>
1 1 1 0
2 1 2 0
3 1 3 0
4 1 4 3
5 2 1 0
6 2 2 0
7 2 3 0
8 2 4 0
9 2 5 2
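To see why the `filter` condition keeps exactly one row with the sum, it can help to trace it by hand on a single transaction. The sketch below does this for transaction 1 of the question's data; the vector `ind` is typed in manually for illustration and the variable names are made up here. It assumes, as in the question, that the 1s sit together at the end of each transaction.

```r
library(dplyr)

# indicator for transaction 1 after the mutate() step:
# zeros stay 0, each 1 has become the group total (3)
ind <- c(0, 0, 0, 3, 3, 3)

# Was the previous row's indicator 0? The first row has no predecessor,
# so default = 0 makes it count as TRUE.
prev_is_zero <- dplyr::lag(ind, default = 0) == 0  # TRUE TRUE TRUE TRUE FALSE FALSE

# cumall() stays TRUE only until the first FALSE is seen
keep <- cumall(prev_is_zero)                       # TRUE TRUE TRUE TRUE FALSE FALSE

ind[keep]                                          # 0 0 0 3
```

Without `lag`, the test `cumall(ind == 0)` would turn FALSE on the fourth row already, and the row holding the sum would be dropped.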