How to calculate a day-by-day rolling average over a month with a flexible window in R?

I am trying to calculate a rolling average of COVID cases for the month of March, day by day.

For example, on 5 March it should take the mean of the cases for the first 5 days of March, and on 20 March it should take the mean of the first 20 days.

I have written a small piece of code for this, but is there a prebuilt function or a better way of doing it?

df :

Country.Region Date       Cases_count
   <chr>          <date>           <dbl>
 1 France         2021-03-01        4730
 2 France         2021-03-02       22872
 3 France         2021-03-03       26903
 4 France         2021-03-04       25286
 5 France         2021-03-05       23507
 6 France         2021-03-06       23306
 7 France         2021-03-07       21835
 8 France         2021-03-08        5534
 9 France         2021-03-09       23143
10 France         2021-03-10       29674

code:

library(dplyr)
library(lubridate)

max_date <- max(df$Date)
march <- seq(ymd("2021-03-01"), max_date, by = "day")

rolling_data <- lapply(march, function(x) {
  # Mean of all cases from 1 March up to and including day x
  rolling_avg <- df %>%
    filter(
      Country.Region == "France",
      Date >= ymd("2021-03-01"),
      Date <= x
    ) %>%
    summarise(rolling_mean = mean(Cases_count))

  # from: https://stackoverflow.com/questions/61038643/loop-through-irregular-list-of-numbers-to-append-rows-to-summary-table
  data.frame(Date = x, rolling_mean = rolling_avg$rolling_mean)
})

# Stack the one-row data frames into a single result
do.call(rbind, rolling_data)

output:

      Date rolling_mean
1  2021-03-01      4730.00
2  2021-03-02     13801.00
3  2021-03-03     18168.33
4  2021-03-04     19947.75
5  2021-03-05     20659.60
6  2021-03-06     21100.67
7  2021-03-07     21205.57
8  2021-03-08     19246.62
9  2021-03-09     19679.56
10 2021-03-10     20679.00

Issue: to use this result alongside the case counts, I would have to join it back onto the original data. If there is a prebuilt function, I could probably use it directly with mutate or summarise.

So what you actually want is a cumulative average, not a rolling/moving average.

A much easier approach is to use cumsum. For example, if you have a vector x with N elements, the cumulative mean can be expressed as:

cumulative_mean <- cumsum(x) / seq_along(x)
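
Since the question asks for something usable with mutate, the same expression can be applied per country inside a dplyr pipeline, which avoids the join. The following is a minimal sketch, assuming dplyr is installed; the data frame is rebuilt from the ten rows shown in the question, and the column name cumulative_mean is only illustrative. dplyr also provides cummean(), which should give the same values.

```r
library(dplyr)

# The ten rows from the question, rebuilt so the example is self-contained
df <- data.frame(
  Country.Region = "France",
  Date = seq(as.Date("2021-03-01"), as.Date("2021-03-10"), by = "day"),
  Cases_count = c(4730, 22872, 26903, 25286, 23507,
                  23306, 21835, 5534, 23143, 29674)
)

result <- df %>%
  group_by(Country.Region) %>%            # restart the average for each country
  arrange(Date, .by_group = TRUE) %>%     # cumulative functions depend on row order
  mutate(cumulative_mean = cumsum(Cases_count) / seq_along(Cases_count)) %>%
  ungroup()

print(result)
```

If the data span several months, filtering to March first (or adding the month to group_by) would restart the average at the beginning of each month.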

For an actual rolling mean with a fixed window, the zoo package provides zoo::rollmean.
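
For comparison, here is a short sketch of a fixed-window rolling mean, assuming the zoo package is installed. The 3-day window is an arbitrary choice for illustration, and the case counts are the ten values from the question.

```r
library(zoo)

cases <- c(4730, 22872, 26903, 25286, 23507,
           23306, 21835, 5534, 23143, 29674)

# 3-day trailing mean: each value averages the current day and the two before it.
# fill = NA pads the first two positions, where a full window is not available.
rolling_3day <- zoo::rollmean(cases, k = 3, fill = NA, align = "right")

print(rolling_3day)
```

Unlike the cumulative mean, the window here always covers the same number of days, so early values do not keep influencing later ones.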
