簡體   English   中英

聚合函數產生每日而不是每小時平均值

[英]aggregate function produces daily instead of hourly mean

我在第一列中有一個15分鍾的時間步長的data.frame,在另外16列中充滿了數據。 我想獲取每一列的每小時平均值。 我正在使用聚合,並且對於1分鍾的數據來說效果很好。

mydata <- list()
for(j in colnames(data_frame)){
  data_mean <- aggregate(data_frame[j], 
                        list(hour=cut(as.POSIXct(data_frame$TIME), "hour")),
                        mean, na.rm=TRUE)
  mydata[[j]] <- data_mean
}

當我使用相同的設置進行15分鍾的數據設置時,它給出的是每日平均值而不是每小時平均值。 知道為什么嗎?

我的數據看起來像這樣1分鍾的數據:

"TIME","Tair","RH"
2016-01-01 00:01:00,5.9,82
2016-01-01 00:02:00,5.9,82
2016-01-01 00:03:00,5.9,82
2016-01-01 00:04:00,5.89,82
2016-01-01 00:05:00,5.8,82
2016-01-01 00:06:00,5.8,82
2016-01-01 00:07:00,5.8,82
2016-01-01 00:08:00,5.8,82
2016-01-01 00:09:00,5.8,82
2016-01-01 00:10:00,5.8,82
2016-01-01 00:11:00,5.8,82
2016-01-01 00:12:00,5.8,82
2016-01-01 00:13:00,5.8,82
2016-01-01 00:14:00,5.8,82
2016-01-01 00:15:00,5.8,82
2016-01-01 00:16:00,5.8,82
2016-01-01 00:17:00,5.8,82
2016-01-01 00:18:00,5.8,82
2016-01-01 00:19:00,5.8,82
2016-01-01 00:20:00,5.8,82
2016-01-01 00:21:00,5.75,82
2016-01-01 00:22:00,5.78,82
2016-01-01 00:23:00,5.78,83
2016-01-01 00:24:00,5.8,82
2016-01-01 00:25:00,5.73,82
2016-01-01 00:26:00,5.7,82
2016-01-01 00:27:00,5.7,82
2016-01-01 00:28:00,5.7,82
2016-01-01 00:29:00,5.7,82
2016-01-01 00:30:00,5.7,82
2016-01-01 00:31:00,5.7,83
2016-01-01 00:32:00,5.76,83
2016-01-01 00:33:00,5.8,83
2016-01-01 00:34:00,5.8,82
2016-01-01 00:35:00,5.8,82
2016-01-01 00:36:00,5.8,83
2016-01-01 00:37:00,5.79,83
2016-01-01 00:38:00,5.7,82

對於15分鍾的數據:

"TIME","Tair","RH"
2016-01-01 00:15:00,6.228442,80.40858
2016-01-01 00:30:00,6.121088,81.00000
2016-01-01 00:45:00,6.075000,NA
2016-01-01 01:00:00,5.951910,NA
2016-01-01 01:15:00,5.844144,NA
2016-01-01 01:30:00,5.802242,NA
2016-01-01 01:45:00,5.747619,NA
2016-01-01 02:00:00,5.742889,NA
2016-01-01 02:15:00,5.752584,81.12135
2016-01-01 02:30:00,5.677753,81.00000
2016-01-01 02:45:00,5.500224,81.61435
2016-01-01 03:00:00,5.225282,82.29797
2016-01-01 03:15:00,5.266441,83.00000
2016-01-01 03:30:00,5.200448,83.32584
2016-01-01 03:45:00,5.098876,84.00000
2016-01-01 04:00:00,5.081061,83.76894
2016-01-01 04:15:00,5.230769,82.88664
2016-01-01 04:30:00,5.300000,82.06742
2016-01-01 04:45:00,5.300000,NA
2016-01-01 05:00:00,5.399776,NA

您的代碼對我有用。

但是,您的循環有點浪費,因為它為data.frame的每一列重復計算TIME列的剪切。 您可以對其進行預先計算,但是有更好的解決方案。

您可以通過調用aggregate()以更簡單,更常規,更有用的形式產生相同的結果:

aggregate(df1[names(df1)!='TIME'],list(hour=cut(df1$TIME,'hour')),mean,na.rm=T);
##         hour     Tair       RH
## 1 2016-01-01 5.786316 82.15789
aggregate(df15[names(df15)!='TIME'],list(hour=cut(df15$TIME,'hour')),mean,na.rm=T);
##                  hour     Tair       RH
## 1 2016-01-01 00:00:00 6.141510 80.70429
## 2 2016-01-01 01:00:00 5.836479      NaN
## 3 2016-01-01 02:00:00 5.668362 81.24523
## 4 2016-01-01 03:00:00 5.197762 83.15595
## 5 2016-01-01 04:00:00 5.227957 82.90767
## 6 2016-01-01 05:00:00 5.399776      NaN

數據

df1 <- data.frame(TIME=as.POSIXct(c('2016-01-01 00:01:00','2016-01-01 00:02:00',
'2016-01-01 00:03:00','2016-01-01 00:04:00','2016-01-01 00:05:00','2016-01-01 00:06:00',
'2016-01-01 00:07:00','2016-01-01 00:08:00','2016-01-01 00:09:00','2016-01-01 00:10:00',
'2016-01-01 00:11:00','2016-01-01 00:12:00','2016-01-01 00:13:00','2016-01-01 00:14:00',
'2016-01-01 00:15:00','2016-01-01 00:16:00','2016-01-01 00:17:00','2016-01-01 00:18:00',
'2016-01-01 00:19:00','2016-01-01 00:20:00','2016-01-01 00:21:00','2016-01-01 00:22:00',
'2016-01-01 00:23:00','2016-01-01 00:24:00','2016-01-01 00:25:00','2016-01-01 00:26:00',
'2016-01-01 00:27:00','2016-01-01 00:28:00','2016-01-01 00:29:00','2016-01-01 00:30:00',
'2016-01-01 00:31:00','2016-01-01 00:32:00','2016-01-01 00:33:00','2016-01-01 00:34:00',
'2016-01-01 00:35:00','2016-01-01 00:36:00','2016-01-01 00:37:00','2016-01-01 00:38:00')),
Tair=c(5.9,5.9,5.9,5.89,5.8,5.8,5.8,5.8,5.8,5.8,5.8,5.8,5.8,5.8,5.8,5.8,5.8,5.8,5.8,5.8,5.75,
5.78,5.78,5.8,5.73,5.7,5.7,5.7,5.7,5.7,5.7,5.76,5.8,5.8,5.8,5.8,5.79,5.7),RH=c(82L,82L,82L,
82L,82L,82L,82L,82L,82L,82L,82L,82L,82L,82L,82L,82L,82L,82L,82L,82L,82L,82L,83L,82L,82L,82L,
82L,82L,82L,82L,83L,83L,83L,82L,82L,83L,83L,82L));

df15 <- data.frame(TIME=as.POSIXct(c('2016-01-01 00:15:00','2016-01-01 00:30:00',
'2016-01-01 00:45:00','2016-01-01 01:00:00','2016-01-01 01:15:00','2016-01-01 01:30:00',
'2016-01-01 01:45:00','2016-01-01 02:00:00','2016-01-01 02:15:00','2016-01-01 02:30:00',
'2016-01-01 02:45:00','2016-01-01 03:00:00','2016-01-01 03:15:00','2016-01-01 03:30:00',
'2016-01-01 03:45:00','2016-01-01 04:00:00','2016-01-01 04:15:00','2016-01-01 04:30:00',
'2016-01-01 04:45:00','2016-01-01 05:00:00')),Tair=c(6.228442,6.121088,6.075,5.95191,
5.844144,5.802242,5.747619,5.742889,5.752584,5.677753,5.500224,5.225282,5.266441,5.200448,
5.098876,5.081061,5.230769,5.3,5.3,5.399776),RH=c(80.40858,81,NA,NA,NA,NA,NA,NA,81.12135,81,
81.61435,82.29797,83,83.32584,84,83.76894,82.88664,82.06742,NA,NA));

暫無
暫無

聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.

 
粵ICP備18138465號  © 2020-2024 STACKOOM.COM