Calculating the daily mode of a time series in R

Question

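I am trying to calculate the daily mode of this time series. In the sample data below, I want the mode of the windDir.c column for each day.

There is no "colMode" argument, so I don't see how to use the apply.daily() wrapper. I therefore tried a custom function with period.apply(), but without success. The code I tried and the dput output are below.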

ep <- endpoints(wind.d,'days') 
modefunc <- function(x) {
  tabresult <- tabulate(x)
  themode <- which(tabresult == max(tabresult))
  if (sum(tabresult == max(tabresult))>1)
    themode <- NA
  return(themode)
}

period.apply(wind.d$windDir.c, INDEX=ep, FUN=function(x) mode(x))

Reproducible data:

wind.d <- structure(list(date = structure(c(1280635200, 1280635200, 1280635200, 
1280635200, 1280635200, 1280635200, 1280635200, 1280721600, 1280721600, 
1280721600, 1280721600, 1280721600, 1280721600, 1280721600, 1280808000, 
1280808000, 1280808000, 1280808000, 1280808000, 1280808000), class = c("POSIXct", 
"POSIXt"), tzone = ""), windDir.c = structure(c(4L, 3L, 3L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 6L, 5L, 5L, 4L, 5L, 5L
), .Label = c("15", "45", "75", "105", "135", "165", "195", "225", 
"255", "285", "315", "345"), class = "factor")), .Names = c("date", 
"windDir.c"), class = "data.frame", row.names = c(NA, -20L))

Answer 1

We can do this easily with dplyr:

library(dplyr)
wind.d %>% group_by(date, windDir.c) %>%
           summarise(count = n()) %>%
           summarise(mode = windDir.c[which.max(count)])

Answer 2

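Or with base R:

# Return the most frequent value of x (the first one seen wins on ties)
calMode <- function(x) {
  ux <- unique(x)
  return(ux[which.max(tabulate(match(x, ux)))])
}
# Apply calMode to windDir.c within each distinct date
myModes <- tapply(as.character(wind.d$windDir.c), INDEX = wind.d$date, FUN = calMode)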
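
To make the tie behaviour of this helper concrete, here is a minimal sketch on toy data. The vectors dirs and days are hypothetical stand-ins, not the question's data; calMode is repeated so the snippet runs on its own.

```r
# Same helper as in the answer: most frequent value, first-seen wins on ties
calMode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

# Hypothetical toy data: day "d1" has a tie between "105" and "75"
dirs <- c("105", "75", "75", "105", "135", "135", "105")
days <- c("d1", "d1", "d1", "d1", "d2", "d2", "d2")

# One mode per day; for d1 the tie is resolved in favour of "105",
# because it appears first in the data
modes <- tapply(dirs, INDEX = days, FUN = calMode)
print(modes)
```

Because which.max() returns the first maximum, a tie is silently resolved by order of appearance. If ties should be reported rather than hidden, the helper needs an explicit check.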

Answer 3

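Note that the code you tried is inconsistent with the dput output you provided. The dput output is not an xts object, and the code you provided only works on xts objects (endpoints fails on the data.frame you provided).

Assuming wind.d really is an xts object, this is easy to do with xts: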

wind.d <- structure(c(105, 75, 75, 105, 105, 105, 105, 105, 105, 105, 105, 
  105, 135, 135, 165, 135, 135, 105, 135, 135), .Dim = c(20L, 1L),
  index = structure(c(1280635200, 1280635200, 1280635200, 1280635200, 
  1280635200, 1280635200, 1280635200, 1280721600, 1280721600, 1280721600, 
  1280721600, 1280721600, 1280721600, 1280721600, 1280808000, 1280808000, 
  1280808000, 1280808000, 1280808000, 1280808000), tzone = "",
  tclass = c("POSIXct", "POSIXt")), .indexCLASS = c("POSIXct", "POSIXt"),
  tclass = c("POSIXct", "POSIXt"), .indexTZ = "", tzone = "",
  .Dimnames = list(NULL, "windDir.c"), class = c("xts", "zoo"))
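library(xts)
# which.max(tabulate(x)) returns the value itself here because the
# directions are positive integers
apply.daily(wind.d, function(x) which.max(tabulate(x)))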
#                     windDir.c
# 2010-07-31 23:00:00       105
# 2010-08-01 23:00:00       105
# 2010-08-02 23:00:00       135
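
If you start from a data.frame like the one in the question rather than an xts object, one possible conversion is sketched below. The data frame df is a small hypothetical stand-in with a fixed UTC time zone, and wind.x is an illustrative name; this is an assumption about how the data could be prepared, not something stated in the answer.

```r
library(xts)

# Hypothetical stand-in for the question's data.frame
df <- data.frame(
  date = as.POSIXct(c("2010-08-01", "2010-08-01", "2010-08-01",
                      "2010-08-02", "2010-08-02", "2010-08-02"), tz = "UTC"),
  windDir.c = factor(c("105", "75", "105", "135", "135", "105"))
)

# Convert the factor labels to numbers; as.numeric() alone would
# return the internal integer codes instead of the directions
vals <- as.numeric(as.character(df$windDir.c))

# Build the xts object, indexed by the date column
wind.x <- xts(vals, order.by = df$date)
colnames(wind.x) <- "windDir.c"

# Daily mode, using the same function as in the answer
daily <- apply.daily(wind.x, function(x) which.max(tabulate(x)))
print(daily)
```

The as.character() step matters: the factor in the question stores codes such as 3 and 4, while the wind directions are the labels "75" and "105".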

Answer 4

We can load the modeest package and use its mfv function (most frequent value):

library(dplyr)
library(modeest)
wind.d %>% group_by(date) %>% summarise(mode = mfv(windDir.c))

Output:

                 date mode
1 2010-08-01 06:00:00  105
2 2010-08-02 06:00:00  105
3 2010-08-03 06:00:00  135

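If there are multiple modes, you need to specify which element to retrieve; otherwise an error is returned. For example, to take the first element: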

mfv(iris[iris$Species=="setosa", 1])
[1] 5.0 5.1
# dplyr
iris %>% group_by(Species) %>% summarise(mode = mfv(Sepal.Length)[1]) 
     Species mode
1     setosa  5.0
2 versicolor  5.5
3  virginica  6.3
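
The tie handling can also be made explicit in base R, without modeest. The sketch below is only an illustration: allModes and strictMode are hypothetical helper names, and returning NA on a tie mirrors what the modefunc in the question appears to intend.

```r
# Return every value tied for the highest count, as character strings
allModes <- function(x) {
  counts <- table(x)
  names(counts)[as.vector(counts) == max(counts)]
}

# Return the single mode, or NA when several values tie
strictMode <- function(x) {
  m <- allModes(x)
  if (length(m) > 1) NA_character_ else m
}

tied   <- c("105", "75", "75", "105")  # two values share the top count
single <- c("135", "135", "105")       # one clear mode

print(allModes(tied))     # both tied values
print(strictMode(tied))   # NA, because of the tie
print(strictMode(single)) # the single mode
```

Either helper can be passed as FUN to tapply() or used inside summarise(); strictMode always returns exactly one value per group, which is what summarise() expects.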

sqldf

For those interested in sqldf, use the following:

library(sqldf)
sqldf("SELECT date, 
            (SELECT [windDir.c]
            FROM [wind.d] 
            WHERE date = tbl.date
            GROUP BY [windDir.c] 
            ORDER BY count(*) DESC
            LIMIT 1) AS mode
      FROM (SELECT DISTINCT date
            FROM [wind.d]) AS tbl")

4 answers

Answer 1
1 Accepted 2015-08-19 22:04:00

Answer 2
1 2015-08-19 22:15:32

Answer 3
1 2015-08-20 05:17:52

Answer 4
1 2015-08-20 09:05:43
