For each Group and Date, I would like to know how long it takes for the value column to increase by 1% or more. More specifically, for each row I want the number of days until a later value in the same group is at least 1% higher than that row's value. For example, for Group A starting on 11/1/17, it took 8 days for the value to increase by 1%: (101-100)/100 = 1%. So, for the next row (Group A, 11/2/17), it took 7 days. And for (Group B, 11/1/17), it took 3 days to increase by 1% or more: (105-100)/100 = 5%.
+-------+---------+--------+
| Group | Date | value |
+-------+---------+--------+
| A | 11/1/17 | 100 |
| A | 11/2/17 | 100 |
| A | 11/3/17 | 100 |
| A | 11/4/17 | 100 |
| A | 11/5/17 | 100 |
| A | 11/6/17 | 100 |
| A | 11/7/17 | 100 |
| A | 11/8/17 | 100 |
| A | 11/9/17 | 101 |
| B | 11/1/17 | 100 |
| B | 11/2/17 | 100 |
| B | 11/3/17 | 100 |
| B | 11/4/17 | 105 |
| B | 11/5/17 | 100 |
| B | 11/6/17 | 107 |
| B | 11/7/17 | 100 |
| B | 11/8/17 | 100 |
+-------+---------+--------+
This is the desired output:
+-------+---------+--------+---------------------------------+
| Group | Date | value | next_1_percent_or_higher_change |
+-------+---------+--------+---------------------------------+
| A | 11/1/17 | 100 | 8 |
| A | 11/2/17 | 100 | 7 |
| A | 11/3/17 | 100 | 6 |
| A | 11/4/17 | 100 | 5 |
| A | 11/5/17 | 100 | 4 |
| A | 11/6/17 | 100 | 3 |
| A | 11/7/17 | 100 | 2 |
| A | 11/8/17 | 100 | 1 |
| A | 11/9/17 | 101 | NA |
| B | 11/1/17 | 100 | 3 |
| B | 11/2/17 | 100 | 2 |
| B | 11/3/17 | 100 | 1 |
| B | 11/4/17 | 105 | 2 |
| B | 11/5/17 | 100 | 1 |
| B | 11/6/17 | 107 | NA |
| B | 11/7/17 | 100 | NA |
| B | 11/8/17 | 100 | NA |
+-------+---------+--------+---------------------------------+
Update
This is what I have so far; however, my solution is not scalable, because it needs one nested ifelse per look-ahead day.
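# Shift x left by n positions and pad the end with NA (a "lead" by n).
# The result is truncated to length(x) so that n >= length(x) is safe.
shift <- function(x, n) {
  c(x[-seq_len(n)], rep(NA, n))[seq_along(x)]
}

# Within each Group, test look-aheads of 1 to 9 rows in turn and
# keep the first one that is at least 1% above the current value.
df <- do.call(rbind, by(df, df$Group, transform,
  next_1_percent_or_higher_change =
    ifelse((shift(value, 1) - value) / value >= .01, 1,
    ifelse((shift(value, 2) - value) / value >= .01, 2,
    ifelse((shift(value, 3) - value) / value >= .01, 3,
    ifelse((shift(value, 4) - value) / value >= .01, 4,
    ifelse((shift(value, 5) - value) / value >= .01, 5,
    ifelse((shift(value, 6) - value) / value >= .01, 6,
    ifelse((shift(value, 7) - value) / value >= .01, 7,
    ifelse((shift(value, 8) - value) / value >= .01, 8,
    ifelse((shift(value, 9) - value) / value >= .01, 9, NA)))))))))))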
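The nested ifelse calls need one branch per look-ahead distance, which is why the approach does not scale. A minimal sketch of the same idea without a fixed limit follows: for every row it scans the later rows of the same group and returns the offset of the first one that is at least 1% higher. The helper name steps_to_increase and its threshold argument are made up for illustration. The result counts rows, so it equals days only under the assumption of one row per consecutive day, sorted by date within each group.

```r
# Hypothetical helper: for each element i, the smallest k > 0 such that
# (x[i + k] - x[i]) / x[i] >= threshold, or NA if no later element qualifies.
steps_to_increase <- function(x, threshold = 0.01) {
  n <- length(x)
  vapply(seq_len(n), function(i) {
    if (i == n) return(NA_integer_)          # nothing after the last row
    later <- x[(i + 1):n]
    hit <- which((later - x[i]) / x[i] >= threshold)
    if (length(hit) == 0) NA_integer_ else hit[1]
  }, integer(1))
}

# The data from the question, rebuilt so the example runs on its own.
df <- data.frame(
  Group = rep(c("A", "B"), times = c(9, 8)),
  Date  = c(paste0("11/", 1:9, "/17"), paste0("11/", 1:8, "/17")),
  value = c(rep(100, 8), 101, 100, 100, 100, 105, 100, 107, 100, 100)
)

# ave() applies the helper within each Group and keeps the original row order.
df$next_1_percent_or_higher_change <-
  ave(df$value, df$Group, FUN = steps_to_increase)
```

The scan is quadratic in the size of each group, which is likely acceptable for short daily series but may need a different approach for very long ones.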
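Perhaps something like this? Note that it looks for the row where value == 101, so it fits the sample data given at the end rather than a general 1% rule.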
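library(tidyverse)
library(lubridate)
df %>%
    # parse the dates first so that rows sort chronologically, not as text
    mutate(Date = mdy(Date)) %>%
    arrange(Group, Date) %>%
    group_by(Group) %>%
    # days from each row to the row in the same group where value == 101
    mutate(next_1_percent_or_higher_change = Date[which(value == 101)] - Date) %>%
    # zero or negative differences mean the target row is not in the future
    mutate(next_1_percent_or_higher_change = replace(
        next_1_percent_or_higher_change,
        next_1_percent_or_higher_change <= 0, NA))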
## A tibble: 17 x 4
## Groups: Group [2]
# Group Date value next_1_percent_or_higher_change
# <fct> <date> <dbl> <time>
# 1 A 2017-11-01 100. 8
# 2 A 2017-11-02 100. 7
# 3 A 2017-11-03 100. 6
# 4 A 2017-11-04 100. 5
# 5 A 2017-11-05 100. 4
# 6 A 2017-11-06 100. 3
# 7 A 2017-11-07 100. 2
# 8 A 2017-11-08 100. 1
# 9 A 2017-11-09 101. NA
#10 B 2017-11-01 100. 3
#11 B 2017-11-02 100. 2
#12 B 2017-11-03 100. 1
#13 B 2017-11-04 101. NA
#14 B 2017-11-05 100. NA
#15 B 2017-11-06 100. NA
#16 B 2017-11-07 100. NA
#17 B 2017-11-08 100. NA
Sample data:
df <- read.table(text =
"Group Date value
A 11/1/17 100
A 11/2/17 100.01
A 11/3/17 100.02
A 11/4/17 100.03
A 11/5/17 100.04
A 11/6/17 100.05
A 11/7/17 100.06
A 11/8/17 100.07
A 11/9/17 101
B 11/1/17 100.01
B 11/2/17 100.02
B 11/3/17 100.03
B 11/4/17 101
B 11/5/17 100.05
B 11/6/17 100.06
B 11/7/17 100.07
B 11/8/17 100.07", header = TRUE)
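Because the pipeline above matches on value == 101, it does not apply the 1% rule relative to each row's own value. A sketch that does, and that measures elapsed days from the Date column instead of counting rows, might look like the following. It uses the data from the question rather than the sample data above, parses dates with base as.Date instead of lubridate, and the helper days_to_increase with its threshold argument is a hypothetical name introduced only for this example.

```r
library(dplyr)

# Hypothetical helper: days from each row to the first later row whose value
# is at least `threshold` (as a fraction) above that row's value, else NA.
# Assumes `date` is sorted in increasing order within the group.
days_to_increase <- function(date, value, threshold = 0.01) {
  idx <- seq_along(value)
  vapply(idx, function(i) {
    later <- which(idx > i & (value - value[i]) / value[i] >= threshold)
    if (length(later) == 0) {
      NA_real_
    } else {
      as.numeric(difftime(date[later[1]], date[i], units = "days"))
    }
  }, numeric(1))
}

# The data from the question, rebuilt so the example runs on its own.
df <- data.frame(
  Group = rep(c("A", "B"), times = c(9, 8)),
  Date  = c(paste0("11/", 1:9, "/17"), paste0("11/", 1:8, "/17")),
  value = c(rep(100, 8), 101, 100, 100, 100, 105, 100, 107, 100, 100)
)

result <- df %>%
  mutate(Date = as.Date(Date, format = "%m/%d/%y")) %>%
  arrange(Group, Date) %>%
  group_by(Group) %>%
  mutate(next_1_percent_or_higher_change = days_to_increase(Date, value)) %>%
  ungroup()
```

Working from the dates means gaps in the series (missing days) are counted correctly, which a row-offset approach would not do.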