
R: Select a date range over multiple years and calculate the mean of values

I have a data frame with hourly data spanning 5 years. I want to calculate the hourly mean (i.e., the mean value for each hour of the day, 1:24) of the values between two dates (e.g., 15 March to 15 April) across several years, and compare it with the hourly mean for the same period in the last year.

Here is an example of the data:

start = as.POSIXct(strptime("2011-01-01 01:00", "%Y-%m-%d %H:%M"))
end   = as.POSIXct(strptime("2016-01-01 01:00", "%Y-%m-%d %H:%M"))
df = data.frame(DateTime = seq(from = start, to = end,by = "hours"))
df$value = runif(nrow(df))

Start_Period = "03-15"
End_Period = "04-15"

The output should look like:

Hour   mean(2011-2014) mean(2015)
1      0.3             0.5
...
24     0.8             0.6

We can filter the rows that fall between the start and end dates, then group by hour and year and take the mean of each group:

library(lubridate)
library(dplyr)

df %>%
    # keep 15-31 March and 1-15 April of every year
    filter((month(DateTime) == 3 & day(DateTime) >= 15) |
           (month(DateTime) == 4 & day(DateTime) <= 15)) %>%
    # one group per hour of day (0-23) and calendar year
    group_by(hour = hour(DateTime), year = year(DateTime)) %>%
    summarise(value = mean(value))
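
The pipeline above returns one row per hour and year, whereas the question asks for one row per hour with a column for 2011-2014 and a column for 2015. A minimal sketch of one way to get that shape follows. It is not part of the original answer and rests on a few assumptions: the tidyr package is available, the time zone is fixed to UTC to sidestep daylight-saving gaps, the period does not wrap around the end of the year, and the names md, period, mean_2011_2014 and mean_2015 are made up for illustration. It reuses the Start_Period and End_Period strings from the question by comparing them with a zero-padded "month-day" string.

```r
library(dplyr)
library(lubridate)
library(tidyr)   # assumption: tidyr is installed (used for pivot_wider)

# Example data as in the question; tz = "UTC" is an assumption
set.seed(1)
start <- as.POSIXct("2011-01-01 01:00", tz = "UTC")
end   <- as.POSIXct("2016-01-01 01:00", tz = "UTC")
df <- data.frame(DateTime = seq(from = start, to = end, by = "hour"))
df$value <- runif(nrow(df))

Start_Period <- "03-15"
End_Period   <- "04-15"

result <- df %>%
  # zero-padded "mm-dd" strings sort in calendar order, so plain
  # string comparison selects the period (no year wrap-around assumed)
  mutate(md = format(DateTime, "%m-%d")) %>%
  filter(md >= Start_Period, md <= End_Period) %>%
  # hypothetical labels: 2015 versus all earlier years
  mutate(period = if_else(year(DateTime) == 2015,
                          "mean_2015", "mean_2011_2014")) %>%
  group_by(Hour = hour(DateTime), period) %>%
  summarise(value = mean(value), .groups = "drop") %>%
  # one column per period, one row per hour
  pivot_wider(names_from = period, values_from = value)

print(result)
```

Note that hour() returns 0 to 23 rather than 1 to 24, so the Hour column may need shifting if the 1:24 convention matters.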

