Calculate duration to specific percentage change for each row in R

For each Group and Date, I would like to know when the value column first increases by 1% or more. More specifically, I would like the duration in days until each value has increased by at least 1%. For example, for Group A starting on 11/1/17, it took 8 days for the value to increase by 1%: (101-100)/100. For the next row (Group A, 11/2/17), it took 7 days. And for (Group B, 11/1/17), it took 3 days to increase by 1% or more: (105-100)/100.

+-------+---------+--------+
| Group |  Date   | value  |
+-------+---------+--------+
| A     | 11/1/17 |    100 |
| A     | 11/2/17 |    100 |
| A     | 11/3/17 |    100 |
| A     | 11/4/17 |    100 |
| A     | 11/5/17 |    100 |
| A     | 11/6/17 |    100 |
| A     | 11/7/17 |    100 |
| A     | 11/8/17 |    100 |
| A     | 11/9/17 |    101 |
| B     | 11/1/17 |    100 |
| B     | 11/2/17 |    100 |
| B     | 11/3/17 |    100 |
| B     | 11/4/17 |    105 |
| B     | 11/5/17 |    100 |
| B     | 11/6/17 |    107 |
| B     | 11/7/17 |    100 |
| B     | 11/8/17 |    100 |
+-------+---------+--------+

This is the desired output:

+-------+---------+--------+---------------------------------+
| Group |  Date   | value  | next_1_percent_or_higher_change |
+-------+---------+--------+---------------------------------+
| A     | 11/1/17 |    100 | 8                               |
| A     | 11/2/17 |    100 | 7                               |
| A     | 11/3/17 |    100 | 6                               |
| A     | 11/4/17 |    100 | 5                               |
| A     | 11/5/17 |    100 | 4                               |
| A     | 11/6/17 |    100 | 3                               |
| A     | 11/7/17 |    100 | 2                               |
| A     | 11/8/17 |    100 | 1                               |
| A     | 11/9/17 |    101 | NA                              |
| B     | 11/1/17 |    100 | 3                               |
| B     | 11/2/17 |    100 | 2                               |
| B     | 11/3/17 |    100 | 1                               |
| B     | 11/4/17 |    105 | 2                               |
| B     | 11/5/17 |    100 | 1                               |
| B     | 11/6/17 |    107 | NA                              |
| B     | 11/7/17 |    100 | NA                              |
| B     | 11/8/17 |    100 | NA                              |
+-------+---------+--------+---------------------------------+

Update

This is what I have so far; however, my solution is not scalable.

# Shift a vector n positions toward the front, padding the tail with NA
shift <- function(x, n) {
  c(x[-seq(n)], rep(NA, n))
}

# One nested ifelse per look-ahead distance (1 to 9 rows), applied per Group
df <- do.call(rbind, by(df, df$Group, transform, next_1_percent_or_higher_change =
  ifelse(((shift(value, 1) - value) / value) >= .01, 1,
  ifelse(((shift(value, 2) - value) / value) >= .01, 2,
  ifelse(((shift(value, 3) - value) / value) >= .01, 3,
  ifelse(((shift(value, 4) - value) / value) >= .01, 4,
  ifelse(((shift(value, 5) - value) / value) >= .01, 5,
  ifelse(((shift(value, 6) - value) / value) >= .01, 6,
  ifelse(((shift(value, 7) - value) / value) >= .01, 7,
  ifelse(((shift(value, 8) - value) / value) >= .01, 8,
  ifelse(((shift(value, 9) - value) / value) >= .01, 9, NA)))))))))))

Perhaps something like this?

library(tidyverse)
library(lubridate)
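df %>%
    mutate(Date = mdy(Date)) %>%    # parse dates first so the sort is chronological
    group_by(Group) %>%
    arrange(Group, Date) %>%
    mutate(
        # days from each row to the row in its group where value == 101
        next_1_percent_or_higher_change = Date[which(value == 101)] - Date) %>%
    # rows on or after the target date have no later increase, so set them to NA
    mutate(next_1_percent_or_higher_change = replace(next_1_percent_or_higher_change, next_1_percent_or_higher_change <= 0, NA))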
## A tibble: 17 x 4
## Groups:   Group [2]
#   Group Date       value next_1_percent_or_higher_change
#   <fct> <date>     <dbl> <time>
# 1 A     2017-11-01  100. 8
# 2 A     2017-11-02  100. 7
# 3 A     2017-11-03  100. 6
# 4 A     2017-11-04  100. 5
# 5 A     2017-11-05  100. 4
# 6 A     2017-11-06  100. 3
# 7 A     2017-11-07  100. 2
# 8 A     2017-11-08  100. 1
# 9 A     2017-11-09  101. NA
#10 B     2017-11-01  100. 3
#11 B     2017-11-02  100. 2
#12 B     2017-11-03  100. 1
#13 B     2017-11-04  101. NA
#14 B     2017-11-05  100. NA
#15 B     2017-11-06  100. NA
#16 B     2017-11-07  100. NA
#17 B     2017-11-08  100. NA
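
The pipeline above locates the target row by matching value == 101, which fits this sample only. A minimal base R sketch of a more general version follows. It assumes that "duration" means calendar days to the first later row in the same group whose value is at least 1% higher, using the (later - current) / current formula from the question. The helper name days_to_increase and the data frame q (a retyped copy of the question's table) are hypothetical, introduced here for illustration.

```r
# Hypothetical helper: for each row, the number of days until the first later
# row whose value is at least `pct` higher; NA when no such row exists.
# Assumes `dates` is a Date vector sorted in increasing order.
days_to_increase <- function(dates, values, pct = 0.01) {
  n <- length(values)
  out <- rep(NA_real_, n)
  for (i in seq_len(n)) {
    change <- (values - values[i]) / values[i]      # relative change vs. row i
    later <- which(seq_len(n) > i & change >= pct)  # only rows after row i
    if (length(later) > 0) {
      out[i] <- as.numeric(difftime(dates[later[1]], dates[i], units = "days"))
    }
  }
  out
}

# The question's table, retyped as a data frame (assumed layout)
q <- data.frame(
  Group = rep(c("A", "B"), times = c(9, 8)),
  Date  = as.Date(sprintf("2017-11-%02d", c(1:9, 1:8))),
  value = c(rep(100, 8), 101, 100, 100, 100, 105, 100, 107, 100, 100)
)

# Apply the helper group by group, after sorting each group by date
q$next_1_percent_or_higher_change <- NA_real_
for (g in unique(q$Group)) {
  idx <- which(q$Group == g)
  idx <- idx[order(q$Date[idx])]
  q$next_1_percent_or_higher_change[idx] <-
    days_to_increase(q$Date[idx], q$value[idx])
}

print(q)
```

On the question's table this gives 8 down to 1 and then NA for Group A, and 3, 2, 1, 2, 1, NA, NA, NA for Group B, matching the desired output. The scan is quadratic in the group size, which is probably fine for short series but would need a different approach for very long ones.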

Sample data

df <- read.table(text =
    "Group   Date    value
 A      11/1/17     100
 A      11/2/17  100.01
 A      11/3/17  100.02
 A      11/4/17  100.03
 A      11/5/17  100.04
 A      11/6/17  100.05
 A      11/7/17  100.06
 A      11/8/17  100.07
 A      11/9/17     101
 B      11/1/17  100.01
 B      11/2/17  100.02
 B      11/3/17  100.03
 B      11/4/17     101
 B      11/5/17  100.05
 B      11/6/17  100.06
 B      11/7/17  100.07
 B      11/8/17  100.07 ", header = TRUE)
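
These sample values are not the same as the table in the question: most rows sit slightly above 100. A quick check, as a sketch in base R, of how far each Group A row is from the final value of 101 under the question's (later - current) / current rule. The vector names are hypothetical.

```r
# Group A values from the sample data above
a <- c(100, 100.01, 100.02, 100.03, 100.04, 100.05, 100.06, 100.07, 101)

# Relative change from each row to the last row (101)
change_to_last <- (a[length(a)] - a) / a

# Which rows see an increase of at least 1% by the last row?
reaches_1pct <- change_to_last >= 0.01
print(data.frame(value = a, change_to_last = round(change_to_last, 5), reaches_1pct))
```

Only the first row (value 100) reaches a full 1%; from 100.01 onward the change to 101 is just under 1%. The durations printed in the output above therefore come from matching the row where value == 101, not from a strict 1% threshold.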
