
Subset xts time-series object in R

I have an xts time-series object covering some months, like this:

library(xts)
  seq<- seq(as.POSIXct("2015-09-01"),as.POSIXct("2015-09-04"), by = "30 mins")
  ob<- xts(data.frame(power=1:(length(seq))),seq)

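Now, for each observation (say A), I want to compute the mean of the observations from the previous two hours. So for each observation A, I need to find the index of the observation that occurred two hours before A, say B. Then I can compute the mean of the observations between A and B. So,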

i=10 # dummy
ind_cur<- index(ob[i,]) # index of current observation
ind_back <- ind_cur - 3600 * 2 # index of 2 hours back observation

Using these indices, I subset ob as

 ob['ind_cur/ind_back']

which results in the following error:

Error in if (length(c(year, month, day, hour, min, sec)) == 6 && c(year,  : 
  missing value where TRUE/FALSE needed
In addition: Warning messages:
1: In as_numeric(YYYY) : NAs introduced by coercion
2: In as_numeric(MM) : NAs introduced by coercion
3: In as_numeric(DD) : NAs introduced by coercion
4: In as_numeric(YYYY) : NAs introduced by coercion
5: In as_numeric(MM) : NAs introduced by coercion
6: In as_numeric(DD) : NAs introduced by coercion

Can anyone help me subset ob? I found a related question, but it was not enough to solve this problem.

Update: the expected output looks like

2015-09-01 00:00:00     1   NA # as I don't have previous data
2015-09-01 00:30:00     2   NA
2015-09-01 01:00:00     3   NA
2015-09-01 01:30:00     4   NA
2015-09-01 02:00:00     5   10/4 # mean of previous 4 observations (last two hours)
2015-09-01 02:30:00     6   14/4  
2015-09-01 03:00:00     7   18/4
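
The error above comes from the quoted string: 'ind_cur/ind_back' is passed to xts literally, so it tries to parse the text "ind_cur" as a date and fails. A range string also has to be ordered from/to, with the earlier time first. A minimal sketch of building the range from the two variables follows; the names idx, rng and win are introduced here for illustration, and it assumes the session time zone matches the time zone of the index.

```r
library(xts)

# Same data as in the question (idx replaces the name seq to avoid shadowing seq())
idx <- seq(as.POSIXct("2015-09-01"), as.POSIXct("2015-09-04"), by = "30 mins")
ob  <- xts(data.frame(power = seq_along(idx)), idx)

i <- 10
ind_cur  <- index(ob)[i]          # time of the current observation
ind_back <- ind_cur - 3600 * 2    # two hours earlier

# Build an ISO 8601 style "from/to" string from the actual time values
fmt <- "%Y-%m-%d %H:%M:%S"
rng <- paste(format(ind_back, fmt), format(ind_cur, fmt), sep = "/")

win <- ob[rng]                    # both endpoints are included
mean(coredata(win$power))
```

Note that this range includes the current observation as well as the one exactly two hours earlier, so it is not yet the "previous 4 observations" mean from the expected output.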

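This is a difficult problem to solve in general, so you need to roll your own solution. The easiest way is to use window to subset by overlapping 2-hour intervals.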

# initialize a result object
ob2 <- ob * NA_real_
# loop over all rows and calculate 2-hour mean
for(i in 2:nrow(ob)) {
  ix <- index(ob)[i]
  ob2[i] <- mean(window(ob, start=ix-3600*2, end=ix))
}
# set incomplete 2-hour intervals to NA
is.na(ob2) <- which(index(ob2) < start(ob2)+3600*2)
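
The window() call above includes the current observation and the one exactly two hours earlier. If the goal is the expected output from the question (mean of the preceding two hours, excluding the current row), one possible variant is sketched below. It works on the numeric index directly, so it does not rely on equal spacing; the names secs, vals, prev_mean and res are made up for this sketch.

```r
library(xts)

idx <- seq(as.POSIXct("2015-09-01"), as.POSIXct("2015-09-04"), by = "30 mins")
ob  <- xts(data.frame(power = seq_along(idx)), idx)

width <- 3600 * 2                          # window length in seconds
secs  <- as.numeric(index(ob))             # index as seconds since the epoch
vals  <- as.numeric(coredata(ob)[, "power"])

prev_mean <- rep(NA_real_, length(secs))
for (i in seq_along(secs)) {
  # leave NA where a full two hours of history is not available
  if (secs[i] - width < secs[1]) next
  # observations in [t - 2h, t), i.e. excluding the current one
  in_win <- secs >= secs[i] - width & secs < secs[i]
  prev_mean[i] <- mean(vals[in_win])
}

res <- xts(cbind(power = vals, prev_mean = prev_mean), order.by = index(ob))
head(res, 7)
```

The loop is quadratic in the number of rows, which is fine for a few hundred observations but would need a smarter window search for long series.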

We can use rollapply() together with lag() to shift the resulting rolling mean by one row.

rollapply(lag(ob), 4, mean)
#                    power
#2015-09-01 00:00:00    NA
#2015-09-01 00:30:00    NA
#2015-09-01 01:00:00    NA
#2015-09-01 01:30:00    NA
#2015-09-01 02:00:00   2.5
#2015-09-01 02:30:00   3.5
#2015-09-01 03:00:00   4.5

# Or if you want it as new variable in your xts object
ob$mean <- rollapply(lag(ob),4,mean)

Based on the "expected output" update to the question and the comment from RS:

library(TTR)
head(SMA(ob$power, 4))  # 2 hour moving average

Result:

                    SMA
2015-09-01 00:00:00  NA
2015-09-01 00:30:00  NA
2015-09-01 01:00:00  NA
2015-09-01 01:30:00 2.5
2015-09-01 02:00:00 3.5
2015-09-01 02:30:00 4.5

This assumes the 30-minute intervals stated in the question.
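
If the spacing is regular but not known in advance, the window width (4 above) could be derived from the index instead of being hard-coded. The following is only a sketch under that regular-spacing assumption; step_secs and n_obs are illustrative names, and it reuses the rollapply(lag(ob), ...) idiom from the earlier answer rather than SMA.

```r
library(xts)

idx <- seq(as.POSIXct("2015-09-01"), as.POSIXct("2015-09-04"), by = "30 mins")
ob  <- xts(data.frame(power = seq_along(idx)), idx)

# Typical spacing between observations, in seconds
step_secs <- median(diff(as.numeric(index(ob))))

# Number of observations that make up two hours
n_obs <- as.integer(round(2 * 3600 / step_secs))

# Rolling mean of the n_obs observations before each row
prev_mean <- rollapply(lag(ob), n_obs, mean)
head(prev_mean, 7)
```

For an irregular series a fixed row count no longer corresponds to two hours, so a time-based window is the safer choice there.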

To make it look more like the expected output:

lag(head(SMA(ob$power, 4),7))

                    SMA
2015-09-01 00:00:00  NA
2015-09-01 00:30:00  NA
2015-09-01 01:00:00  NA
2015-09-01 01:30:00  NA
2015-09-01 02:00:00 2.5
2015-09-01 02:30:00 3.5
2015-09-01 03:00:00 4.5

data.table provides rolling functions that are useful for both single and multiple time series:

library(data.table)
head(as.data.table(ob)[, roll_power := frollmean(power, 4, align = 'right')])

# right-aligned window: each value is the mean over four half-hour observations ending at that row

                 index power roll_power
1: 2015-09-01 00:00:00     1         NA
2: 2015-09-01 00:30:00     2         NA
3: 2015-09-01 01:00:00     3         NA
4: 2015-09-01 01:30:00     4        2.5 # the rolling mean covers this, and preceding rows
5: 2015-09-01 02:00:00     5        3.5
6: 2015-09-01 02:30:00     6        4.5
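
The output above still includes the current row in each mean. To match the expected output in the question, the series can be shifted by one row before rolling. This is a small sketch using data.table's shift(); the column name prev_mean and the variable dt are made up here.

```r
library(xts)
library(data.table)

idx <- seq(as.POSIXct("2015-09-01"), as.POSIXct("2015-09-04"), by = "30 mins")
ob  <- xts(data.frame(power = seq_along(idx)), idx)

dt <- as.data.table(ob)   # the xts index becomes a column named "index"

# shift() lags power by one row (filling with NA), so the right-aligned
# 4-row window covers only the four observations before the current one
dt[, prev_mean := frollmean(shift(power), 4, align = "right")]
head(dt, 7)
```

Because the first shifted value is NA and frollmean() does not drop NAs by default, the first four rows stay NA.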
