R: Efficiently locating the time series segments with maximal cross-correlation to an input segment?

Question

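I have a long numeric time series of about 200,000 rows (let's call it Z).

In a loop, I subset x (about 30) consecutive rows from Z at a time and treat them as the query points q.

I want to locate within Z the y (about 300) time series segments of length x that are most correlated with q.

What is an efficient way to do this?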

Answer 1

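The code below finds the 300 segments you are looking for and runs in about 8 seconds on my Windows laptop, so it should be fast enough for your purposes.

First, it constructs a 30-by-199971 matrix (Zmat) whose columns contain all of the length-30 "time series segments" you want to examine. A single call to cor(), operating on the vector q and the matrix Zmat, then calculates all of the required correlation coefficients. Finally, the resulting vector is examined to identify the 300 segments with the highest correlation coefficients.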

# Simulate data
nZ <- 200000
nq <- 30
Z <- rnorm(nZ)
q <- seq_len(nq)

# From Z, construct a 30 by 199971 matrix, in which each column is a
# "time series segment". Column 1 contains observations 1:30, column 2
# contains observations 2:31, and so on through the end of the series.
Zmat <- sapply(seq_len(nZ - nq + 1),  
               FUN = function(X) Z[seq(from = X, length.out = nq)])

# Calculate the correlation of q with every column ("time series segment").
Cors <- cor(q, Zmat)

# Extract the starting position of the 300 most highly correlated segments    
ids <- order(Cors, decreasing=TRUE)[1:300]

# Maybe try something like the following to confirm that you have
# selected the most highly correlated segments.
hist(Cors, breaks=100)
hist(Cors[ids], col="red", add=TRUE)
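As a quick check that the entries of ids really are starting positions in Z, the correlation for the best segment can be recomputed directly from the series. This is a minimal sketch on a smaller simulated series so that it runs quickly; the reduced nZ and the names start and direct are illustrative assumptions, not part of the answer above.

```r
# Smaller simulated data than in the answer, just to keep the check fast
set.seed(42)
nZ <- 2000
nq <- 30
Z <- rnorm(nZ)
q <- seq_len(nq)

# Same construction as above: column j holds Z[j:(j + nq - 1)]
Zmat <- sapply(seq_len(nZ - nq + 1),
               FUN = function(X) Z[seq(from = X, length.out = nq)])
Cors <- cor(q, Zmat)
ids <- order(Cors, decreasing = TRUE)[1:300]

# ids[1] is the starting position of the best segment; recompute its
# correlation straight from Z and compare with the value stored in Cors
start <- ids[1]
direct <- cor(q, Z[start:(start + nq - 1)])
check <- isTRUE(all.equal(direct, Cors[start]))
```

If check is TRUE, column index and starting position coincide, so Z[ids[i]:(ids[i] + nq - 1)] recovers the i-th best segment.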

Answer 2

The naive solution is indeed very slow (at least a few minutes; I was not patient enough to wait):

library(zoo)
n <- 2e5
k <- 30
z <- rnorm(n)
x <- rnorm(k) # We do not use the fact that x is a part of z
rollapply(z, k, function(u) cor(u,x), align="left")

You can compute the correlation by hand from the first moments and co-moments, but it still takes a few minutes.

y <- zoo(rnorm(n), 1:n)
x <- rnorm(k)
exy <- exx <- eyy <- ex <- ey <- zoo( rep(0,n), 1:n )
for(i in 1:k) {
  cat(i, "\n")
  exy <- exy + lag(y,i-1) * x[i]
  ey  <- ey  + lag(y,i-1) 
  eyy <- eyy + lag(y,i-1)^2 
  ex  <- ex  + x[i]    # Constant time series
  exx <- exx + x[i]^2  # Constant time series
}
exy <- exy/k
ex <- ex/k
ey <- ey/k
exx <- exx/k
eyy <- eyy/k
covxy <- exy - ex * ey
vx <- exx - ex^2
vy <- eyy - ey^2
corxy <- covxy / sqrt( vx * vy )

Once you have the time series of correlations, it is easy to extract the positions of the top 300.

i <- order(corxy, decreasing=TRUE)[1:300]
corxy[i]
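The same moment identities can also be evaluated without zoo and without the explicit loop. The sketch below is not from the answer above: it assumes plain numeric vectors and swaps in base R's cumsum() for the rolling sums of z and z^2, and stats::filter() for the rolling cross-product with x. The names m, cs, cs2, sy, syy, sxy and top are illustrative.

```r
set.seed(1)
n <- 2e5
k <- 30
z <- rnorm(n)
x <- rnorm(k)

m <- n - k + 1                     # number of length-k windows in z

# Rolling sums of z and z^2 over each window, via differences of cumulative sums
cs  <- c(0, cumsum(z))
cs2 <- c(0, cumsum(z^2))
sy  <- cs[(k + 1):(n + 1)]  - cs[1:m]
syy <- cs2[(k + 1):(n + 1)] - cs2[1:m]

# Rolling sum of z[s + j - 1] * x[j] for each window start s.
# With sides = 1 the filter output at position i covers z[(i - k + 1):i],
# so the coefficients are reversed and the first k - 1 (NA) values dropped.
sxy <- as.numeric(stats::filter(z, rev(x), sides = 1))[k:n]

# Same formulas as in the loop version: moments, then covariance / variances
ex    <- mean(x)
vx    <- mean(x^2) - ex^2
ey    <- sy / k
vy    <- syy / k - ey^2
covxy <- sxy / k - ex * ey
corxy <- covxy / sqrt(vx * vy)

# Starting positions of the 300 most highly correlated windows
top <- order(corxy, decreasing = TRUE)[1:300]
```

Because the variances are obtained as differences of raw moments, this form can lose precision when the series has a large mean relative to its spread; centering z beforehand is one way to reduce that risk.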

Answer 1: score 5, accepted, 2012-02-05 00:49:54

Answer 2: score 3, 2012-02-02 06:37:32
