Resampling data does not work for every requested sampling rate

Question

I generate time series data at a 1-minute sampling rate:

library(xts)
# create timestamps with a 1-minute sampling rate
timerange <- seq(as.POSIXct("2016-06-09"),as.POSIXct("2016-06-22 23:59:59"), by = "1 min")
# create xts object
data_xts <- xts(rnorm(length(timerange),200,5),timerange)

Now I want to resample (change the sampling rate) to a 50-minute rate, so I wrote a custom function:

resample_data_minutely_daywise <- function(data_xts,xminutes) {
  day_data <- split.xts(data_xts,"days",k=1) # divide data daywise
  # Now resample data according to parameter xminutes
  day_list <- lapply(day_data, function(x) { 
    ds_data <- period.apply(x,INDEX = endpoints(index(x), on = "minutes", k = xminutes ), FUN= mean)
    align_data <- align.time(ds_data,xminutes*60) # aligning to x seconds
    return(align_data)
  })
  return(day_list)
}

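This function takes the time series data and the desired sampling period in minutes as input. It splits the data by day, then resamples each day by taking the mean over each period.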

Now, when I call this function as

p <- resample_data_minutely_daywise(data_xts,50)
sapply(p,length) # check no. of observations in each day

the output is:

 sapply(p,length) # check no. of observations in each day
 [1] 30 30 30 29 29 30 30 30 29 29 30 30 30 29

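This shows that not every day contains the same number of readings: some days have 29 observations and others have 30. What causes this unexpected behavior? Note that when I resample at 10, 20, 30, or 60 minutes, every day contains the same number of readings. The problem only occurs when I try 50 minutes.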

Answer 1

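Your trouble is that period.apply() uses endpoints() to find where the breaks are, and the endpoints() output is always offset from the UNIX epoch (1970-01-01 00:00:00). But you want the breaks to be offset from midnight of the current day. A day has 1440 minutes, which 10, 20, 30, and 60 divide evenly but 50 does not, so epoch-anchored 50-minute breaks fall at different clock times from one day to the next.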
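The effect can be reproduced with plain arithmetic, without xts. The sketch below is only an illustration: it assumes 50-minute bins anchored at the epoch (integer division of epoch seconds by 3000) and uses UTC so the result does not depend on the local time zone. The variable names are made up for this example.

```r
# 14 days of 1-minute timestamps as epoch seconds (UTC assumed for reproducibility)
t0 <- as.numeric(as.POSIXct("2016-06-09", tz = "UTC"))
secs <- t0 + 60 * (0:(14 * 1440 - 1))

# Day number within the sample (0..13) and epoch-anchored 50-minute bin id
day <- (secs - t0) %/% 86400
bin <- secs %/% (50 * 60)

# Count how many distinct epoch-anchored bins each day touches
bins_per_day <- as.integer(tapply(bin, day, function(b) length(unique(b))))
bins_per_day
# [1] 30 30 30 29 29 30 30 30 29 29 30 30 30 29

# A day is a whole number of bins only when the remainder is zero
remainder <- 1440 %% c(10, 20, 30, 50, 60)
remainder
# [1]  0  0  0 40  0
```

Because the bin boundaries shift by 40 minutes relative to midnight each day, some days straddle 30 bins and others 29.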

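You can still do this with period.apply(), but you need to calculate custom endpoints. In your case, you can do that by finding where the seconds elapsed since the start of the day cross a multiple of xminutes.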

resample_data_minutely_daywise <- function(data_xts,xminutes) {
  day_data <- split.xts(data_xts,"days",k=1) # divide data daywise
  # Now resample data according to parameter xminutes
  day_list <- lapply(day_data, function(x) { 
    timeT <- .index(x) - .index(x)[1]
    # when does timeT cross a multiple of xminutes?
    ep <- which(timeT %% (xminutes * 60) <= 0)
    # endpoints must start with zero and end with nrow
    ep <- c(0, ep, nrow(x))
    # ...and be unique
    ep <- unique(ep)
    ds_data <- period.apply(x, INDEX = ep, FUN = mean)
    align_data <- align.time(ds_data,xminutes*60) # aligning to x seconds
    return(align_data)
  })
  return(day_list)
}
p <- resample_data_minutely_daywise(data_xts,50)
sapply(p,length)
# [1] 30 30 30 30 30 30 30 30 30 30 30 30 30 30
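To see where the 30 comes from, the breakpoint arithmetic can be traced for a single day in base R. This is a sketch under the assumption that the day holds exactly 1440 one-minute observations starting at midnight; secs_since_midnight stands in for timeT in the function above.

```r
xminutes <- 50

# Seconds since midnight for 1440 one-minute observations (assumed complete day)
secs_since_midnight <- 60 * (0:1439)

# Rows where the elapsed time is an exact multiple of xminutes
ep <- which(secs_since_midnight %% (xminutes * 60) == 0)

# Endpoints must start with zero, end with the row count, and be unique
ep <- unique(c(0, ep, length(secs_since_midnight)))

n_bins <- length(ep) - 1   # number of groups period.apply() would form
bin_sizes <- diff(ep)      # observations per group
n_bins
# [1] 30
range(bin_sizes)
# [1]  1 50
```

With these endpoints the first group holds only the midnight observation, the next 28 groups hold 50 observations each, and the last group holds the remaining 39, so every complete day yields the same count.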

Solution 1
0 2017-06-19 22:18:33
