How to calculate the mean for the previous 7 days at the same time of day

Question

Hi, I have a data frame as shown below.

In the df below, how can I fill the NAs in the "value" column to get the "output" column, where each NA is replaced by the average of the values at the same time of day over the previous 7 days? E.g., if the value for 2014-02-08 00:45 is NA, it should be replaced with the mean of the values at 00:45 from Feb 1 to Feb 7.

dates = c('21-01-2014 00:15', '21-01-2014 00:30','21-01-2014 00:45','22-01-2014 00:00','22-01-2014 00:30','22-01-2014 00:45','23-01-2014 00:00','23-01-2014 00:15','23-01-2014 00:45','25-01-2014 00:45','26-01-2014 00:45','26-01-2014 00:46','26-01-2014 00:30','27-02-2014 00:45','28-02-2014 00:45','29-03-2014 00:45','30-03-2014 00:00','30-03-2014 00:45','30-03-2014 00:45','31-03-2014 00:45','01-04-2014 00:45','02-04-2014 00:45','03-04-2014 00:45')
value = c(20,   5,  10, 23, NA, 22, 12, 10, NA, 12, NA, 4,  19, 12, 
          NA,   NA, 2,  2,  NA, 14, NA, 21, NA)
output =c(20,   5,  10, 23, 5,  22, 12, 10, 10, 12, 11, 4,  19, 12,
          14,   14, 2,  2,  11.6,   14, 12, 21, 13.28)

df <- data.frame(dates, value, output)

df$dates <- as.POSIXct(strptime(df$dates, format = "%d-%m-%Y %H:%M", tz = "GMT"))

Thanks in advance.

Answer 1

You can loop through the rows and replace each NA with the mean of the (up to) seven preceding rows.

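library(data.table)
library(dplyr)
df <- df %>% as.data.table()
for (index in seq_len(nrow(df))) {
  # Only fill rows whose value is missing; the first row has nothing before it
  if (index > 1 && df[index, value] %>% is.na()) {
    # Up to seven preceding rows. Note the parentheses: 1:index-1 parses as (1:index) - 1
    first <- max(1, index - 7)
    # na.rm = TRUE so that one missing neighbour does not turn the mean into NA
    df[index, value := df[first:(index - 1), value] %>% mean(na.rm = TRUE)]
  }
}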

I used data.table because I am more familiar with it. You can convert back to a data.frame after the processing if you want.

Tell me if this is what you want.
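
The loop above averages the previous seven rows whatever their timestamps are. As a rough sketch of a variant that only uses rows with the same time of day within the previous 7 days, something like the following base R loop could work. The toy data, the `filled` column, and the helper names `time_of_day`, `age_days`, and `in_window` are assumptions for illustration and are not part of the original answer.

```r
# Toy data (hypothetical values): mostly 00:45 readings, plus one at 00:30
df <- data.frame(
  dates = as.POSIXct(c("2014-02-01 00:45", "2014-02-02 00:45", "2014-02-02 00:30",
                       "2014-02-03 00:45", "2014-02-12 00:45"), tz = "GMT"),
  value = c(10, 20, 99, NA, NA)
)

time_of_day <- format(df$dates, "%H:%M")  # format() keeps the GMT time zone
filled <- df$value

for (i in which(is.na(df$value))) {
  # Age of every row relative to row i, in days (positive = earlier than row i)
  age_days <- as.numeric(difftime(df$dates[i], df$dates, units = "days"))
  # Same time of day, strictly earlier, and at most 7 days back
  in_window <- time_of_day == time_of_day[i] & age_days > 0 & age_days <= 7
  candidates <- df$value[in_window]
  # Leave the NA in place when the window holds no usable value
  if (any(!is.na(candidates))) {
    filled[i] <- mean(candidates, na.rm = TRUE)
  }
}

df$filled <- filled
```

Here the means are taken from the original values only, so an NA filled earlier in the loop does not feed into later means. Whether that is the desired behaviour depends on the use case.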

Answer 2

I would join the data frame with itself, matching two rows whenever one of them belongs to the group of rows whose average you want for the other.

library(data.table)
dt <- data.table(df)
# format() respects the GMT tzone attribute; strftime() would convert to the local time zone
dt[ , c("id", "dates_tmp1", "dates_tmp2", "dates_7", "time")
 := list(1:nrow(dt), dates, dates, dates - as.difftime(7, units = "days"), format(dates, "%H:%M:%S"))]

This creates some temporary columns for the join so that the original data is not overwritten.

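# For each row (id), match all rows with the same time of day from the 7 days up to and including it
joined <- dt[dt, on = .(dates_tmp1 >= dates_tmp1, dates_7 <= dates_tmp2, time == time), allow.cartesian = TRUE]
mean_values <- joined[ , list(mean_value = mean(i.value, na.rm = TRUE)), by = "id"]
mean_values <- mean_values[order(id)]
head(mean_values)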
    id mean_value
 1:  1   20.00000
 2:  2    5.00000
 3:  3   10.00000
 4:  4   23.00000
 5:  5    5.00000
 6:  6   16.00000

Take these values to replace the NA ones.
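
As a minimal sketch of that last step, the means can be written back with an update join on `id`. The toy data below is hypothetical, and `fcoalesce()` (available in recent data.table versions) is just one way to keep existing values and fill only the missing ones.

```r
library(data.table)

# Toy data (hypothetical values): one reading per day at 00:45, two of them missing
dt <- data.table(
  dates = as.POSIXct(c("2014-02-01 00:45", "2014-02-02 00:45",
                       "2014-02-03 00:45", "2014-02-04 00:45"), tz = "GMT"),
  value = c(10, 20, NA, NA)
)
dt[, c("id", "dates_tmp1", "dates_tmp2", "dates_7", "time")
   := list(.I, dates, dates, dates - as.difftime(7, units = "days"),
           format(dates, "%H:%M:%S"))]

# Same self-join as above: rows with the same time of day within the last 7 days
joined <- dt[dt, on = .(dates_tmp1 >= dates_tmp1, dates_7 <= dates_tmp2, time == time),
             allow.cartesian = TRUE]
mean_values <- joined[, list(mean_value = mean(i.value, na.rm = TRUE)), by = "id"]

# Update join: keep existing values, fill NAs with the matching mean
dt[mean_values, on = "id", value := fcoalesce(value, i.mean_value)]

# Drop the temporary helper columns
dt[, c("dates_tmp1", "dates_tmp2", "dates_7") := NULL]
```

With this toy input the two missing readings are both filled with 15, the mean of 10 and 20.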

If you instead want the last 7 days that actually occur in the data (rather than the last 7 calendar days), you can create a new column that enumerates the days and then do the same.

dt[ , c("id", "time") := list(1:nrow(dt), format(dates, "%H:%M:%S"))]
# Number the distinct days on which each time of day occurs: 1, 2, 3, ...
dt[ , days := as.numeric(frank(as.Date(dates), ties.method = "dense")), by = time]
dt[ , days_7 := days - 7]
joined <- dt[dt, on = .(days >= days, days_7 <= days, time == time), allow.cartesian = TRUE]
mean_values <- joined[ , list(mean_value = mean(i.value, na.rm = TRUE)), by = "id"]
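
To see what the `days` column does, here is a small illustration of dense ranking on dates with gaps. The dates and the name `day_index` are made up for the example.

```r
library(data.table)

# Hypothetical dates with gaps and one repeated day
days <- as.Date(c("2014-01-21", "2014-01-22", "2014-01-25", "2014-01-25", "2014-02-27"))

# Dense ranking numbers the distinct days consecutively, ignoring calendar gaps
day_index <- as.numeric(frank(days, ties.method = "dense"))
print(day_index)
# 1 2 3 3 4
```

Because the index advances by one per occurring day, `days - 7` reaches back seven occurring days even when the calendar gap is much longer.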
