
How to calculate a mean for a date interval in R?

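I have a dataset (data.weather) with a weather variable (TMAX) for two locations (combinations of LAT and LON) and two years. In this mock example, TMAX is available for ten days per year and location. I need to calculate the mean TMAX (mean_TMAX) for each of the four rows in data.locs. This last dataset gives the date range over which I need to calculate the mean, that is, between DATE_0 and DATE_1.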

Here is the code I wrote:

library(dplyr)
library(lubridate)

data.weather <-read.csv(text = "
LAT,LON,YEAR,DATE,TMAX
36,-89,2010,1/1/2010,25
36,-89,2010,1/2/2010,25
36,-89,2010,1/3/2010,25
36,-89,2010,1/4/2010,28
36,-89,2010,1/5/2010,28
36,-89,2010,1/6/2010,29
36,-89,2010,1/7/2010,25
36,-89,2010,1/8/2010,25
36,-89,2010,1/9/2010,25
36,-89,2010,1/10/2010,28
36,-89,2011,1/1/2011,26
36,-89,2011,1/2/2011,25
36,-89,2011,1/3/2011,28
36,-89,2011,1/4/2011,26
36,-89,2011,1/5/2011,27
36,-89,2011,1/6/2011,27
36,-89,2011,1/7/2011,28
36,-89,2011,1/8/2011,29
36,-89,2011,1/9/2011,27
36,-89,2011,1/10/2011,26
40,-96,2010,1/1/2010,29
40,-96,2010,1/2/2010,28
40,-96,2010,1/3/2010,25
40,-96,2010,1/4/2010,25
40,-96,2010,1/5/2010,28
40,-96,2010,1/6/2010,29
40,-96,2010,1/7/2010,26
40,-96,2010,1/8/2010,28
40,-96,2010,1/9/2010,26
40,-96,2010,1/10/2010,25
40,-96,2011,1/1/2011,29
40,-96,2011,1/2/2011,27
40,-96,2011,1/3/2011,29
40,-96,2011,1/4/2011,25
40,-96,2011,1/5/2011,28
40,-96,2011,1/6/2011,29
40,-96,2011,1/7/2011,29
40,-96,2011,1/8/2011,25
40,-96,2011,1/9/2011,25
40,-96,2011,1/10/2011,26
") %>%
  mutate(DATE = as.Date(DATE, format = "%m/%d/%Y"))

data.locs <-read.csv(text = "
LAT,LON,YEAR,DATE_0,DATE_1,GEN,PR
36,-89,2010,1/2/2010,1/9/2010,MN103,35
36,-89,2011,1/1/2011,1/10/2011,IA100,33
40,-96,2010,1/4/2010,1/8/2010,MN103,36
40,-96,2011,1/2/2011,1/6/2011,IA100,34
") %>%
  mutate(DATE_0 = as.Date(DATE_0, format = "%m/%d/%Y"),
         DATE_1 = as.Date(DATE_1, format = "%m/%d/%Y"))

tmax.calculation <- data.locs %>%
  group_by(LAT,LON,YEAR, GEN) %>%
  mutate(mean_TMAX = mean(data.weather$TMAX[data.weather$DATE %within% interval(DATE_0, DATE_1)]))

This is the expected result:

LAT LON YEAR  DATE_0    DATE_1    GEN    PR  mean_TMAX
36  -89 2010  1/2/2010  1/9/2010  MN103  35  26.25
36  -89 2011  1/1/2011  1/10/2011 IA100  33  26.90
40  -96 2010  1/4/2010  1/8/2010  MN103  36  27.20
40  -96 2011  1/2/2011  1/6/2011  IA100  34  27.60

However, this is what I get:

LAT LON YEAR  DATE_0    DATE_1    GEN    PR  mean_TMAX
36  -89 2010  1/2/2010  1/9/2010  MN103  35  26.5625
36  -89 2011  1/1/2011  1/10/2011 IA100  33  27.0500
40  -96 2010  1/4/2010  1/8/2010  MN103  36  27.1000
40  -96 2011  1/2/2011  1/6/2011  IA100  34  27.1000

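The problem I am having is that, when the interval is looked up in data.weather, the mean is calculated over the correct date interval but across both locations (combinations of LAT and LON). I could not find a way to indicate that the mean should be calculated separately for each LAT and LON combination.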
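The pooling described above can be reproduced on a small scale. The following is a minimal sketch in base R, not the asker's code: it rebuilds only the 2010 rows from the question under the assumed names `weather`, `date_0`, and `date_1`, and compares a date-only filter with a date-plus-location filter for the first row of data.locs.

```r
# 2010 rows from the question; `weather` is an illustrative name
weather <- data.frame(
  LAT  = rep(c(36, 40), each = 10),
  LON  = rep(c(-89, -96), each = 10),
  DATE = rep(seq(as.Date("2010-01-01"), by = "day", length.out = 10), times = 2),
  TMAX = c(25, 25, 25, 28, 28, 29, 25, 25, 25, 28,   # LAT 36, LON -89
           29, 28, 25, 25, 28, 29, 26, 28, 26, 25)   # LAT 40, LON -96
)

# Window of the first row of data.locs (LAT 36, LON -89, 2010)
date_0 <- as.Date("2010-01-02")
date_1 <- as.Date("2010-01-09")

# Date-only filter: nothing restricts the match to one location
in_window   <- weather$DATE >= date_0 & weather$DATE <= date_1
n_matched   <- sum(in_window)
pooled_mean <- mean(weather$TMAX[in_window])

# Date plus location filter: only rows for LAT 36, LON -89
at_site   <- weather$LAT == 36 & weather$LON == -89
site_mean <- mean(weather$TMAX[in_window & at_site])

print(c(n_matched = n_matched, pooled = pooled_mean, site = site_mean))
```

The date-only filter picks up 16 rows (8 per location), which is where 26.5625 comes from; restricting the rows to LAT 36, LON -89 gives the expected 26.25.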

This should do it:

library(dplyr)
library(lubridate)

data.weather <-read.csv(text = "
LAT,LON,YEAR,DATE,TMAX
36,-89,2010,1/1/2010,25
36,-89,2010,1/2/2010,25
36,-89,2010,1/3/2010,25
36,-89,2010,1/4/2010,28
36,-89,2010,1/5/2010,28
36,-89,2010,1/6/2010,29
36,-89,2010,1/7/2010,25
36,-89,2010,1/8/2010,25
36,-89,2010,1/9/2010,25
36,-89,2010,1/10/2010,28
36,-89,2011,1/1/2011,26
36,-89,2011,1/2/2011,25
36,-89,2011,1/3/2011,28
36,-89,2011,1/4/2011,26
36,-89,2011,1/5/2011,27
36,-89,2011,1/6/2011,27
36,-89,2011,1/7/2011,28
36,-89,2011,1/8/2011,29
36,-89,2011,1/9/2011,27
36,-89,2011,1/10/2011,26
40,-96,2010,1/1/2010,29
40,-96,2010,1/2/2010,28
40,-96,2010,1/3/2010,25
40,-96,2010,1/4/2010,25
40,-96,2010,1/5/2010,28
40,-96,2010,1/6/2010,29
40,-96,2010,1/7/2010,26
40,-96,2010,1/8/2010,28
40,-96,2010,1/9/2010,26
40,-96,2010,1/10/2010,25
40,-96,2011,1/1/2011,29
40,-96,2011,1/2/2011,27
40,-96,2011,1/3/2011,29
40,-96,2011,1/4/2011,25
40,-96,2011,1/5/2011,28
40,-96,2011,1/6/2011,29
40,-96,2011,1/7/2011,29
40,-96,2011,1/8/2011,25
40,-96,2011,1/9/2011,25
40,-96,2011,1/10/2011,26
") %>%
  mutate(DATE = as.Date(DATE, format = "%m/%d/%Y"))

data.locs <-read.csv(text = "
LAT,LON,YEAR,DATE_0,DATE_1,GEN,PR
36,-89,2010,1/2/2010,1/9/2010,MN103,35
36,-89,2011,1/1/2011,1/10/2011,IA100,33
40,-96,2010,1/4/2010,1/8/2010,MN103,36
40,-96,2011,1/2/2011,1/6/2011,IA100,34
") %>%
  mutate(DATE_0 = as.Date(DATE_0, format = "%m/%d/%Y"),
         DATE_1 = as.Date(DATE_1, format = "%m/%d/%Y"))


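tmax.calculation <- data.locs %>%
  group_by(LAT,LON,YEAR,GEN) %>%
  # Expand each row of data.locs into one row per day from DATE_0 to DATE_1
  do(data.frame(LAT=.$LAT, 
                LON=.$LON,
                YEAR=.$YEAR,
                GEN=.$GEN,
                DATE=seq(.$DATE_0, .$DATE_1, by="days"))) %>%
  # Attach TMAX for the same location, year, and day only;
  # a day with no matching weather row gets TMAX = NA
  left_join(data.weather, by=c("LAT", "LON", "YEAR", "DATE")) %>%
  # The grouping is still in place, so this gives one mean per data.locs row
  summarise(mean_TMAX = mean(TMAX))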

Result:

[Image of the resulting table; not preserved in this copy]
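As an alternative to expanding each interval into daily rows, the same means could be obtained by joining on location and year first and then filtering by date. This is a sketch under assumptions, not part of the original answer: `weather`, `locs`, and `result` are illustrative names, and only the 2010 rows from the question are rebuilt to keep it short.

```r
library(dplyr)

# 2010 rows of data.weather from the question (illustrative name: weather)
weather <- data.frame(
  LAT  = rep(c(36, 40), each = 10),
  LON  = rep(c(-89, -96), each = 10),
  YEAR = 2010,
  DATE = rep(seq(as.Date("2010-01-01"), by = "day", length.out = 10), times = 2),
  TMAX = c(25, 25, 25, 28, 28, 29, 25, 25, 25, 28,   # LAT 36, LON -89
           29, 28, 25, 25, 28, 29, 26, 28, 26, 25)   # LAT 40, LON -96
)

# 2010 rows of data.locs from the question (illustrative name: locs)
locs <- data.frame(
  LAT    = c(36, 40),
  LON    = c(-89, -96),
  YEAR   = 2010,
  DATE_0 = as.Date(c("2010-01-02", "2010-01-04")),
  DATE_1 = as.Date(c("2010-01-09", "2010-01-08")),
  GEN    = "MN103"
)

result <- locs %>%
  # Pair every window with the weather rows of its own location and year
  inner_join(weather, by = c("LAT", "LON", "YEAR")) %>%
  # Keep only the days that fall inside the window (both ends inclusive)
  filter(DATE >= DATE_0, DATE <= DATE_1) %>%
  group_by(LAT, LON, YEAR, GEN, DATE_0, DATE_1) %>%
  summarise(mean_TMAX = mean(TMAX), .groups = "drop")

print(result)
```

With these rows the two means come out as 26.25 and 27.20, matching the expected values in the question. Because the join keys include LAT and LON, weather rows from the other location never enter a window. This variant also avoids `do()`, which dplyr now documents as superseded.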
