Speeding up inverse calculation in weighted least squares mean estimate in R

Question

I need to speed up the calculation of the mean estimate of beta in a WLS in R. I was able to speed up the covariance calculation thanks to SO, and now I am wondering whether there is another trick to speed up the mean calculation as well (or whether what I am doing is already efficient enough).

n = 10000
y = rnorm(n, 3, 0.4)
X = matrix(c(rnorm(n,1,2), sample(c(1,-1), n, replace = TRUE), rnorm(n,2,0.5)), nrow = n, ncol = 3)
Q = diag(rnorm(n, 1.5, 0.3))
wls.cov.matrix = crossprod(X / sqrt(diag(Q)))
Q.inv = diag(1/diag(Q))
wls.mean = wls.cov.matrix%*%t(X)%*%Q.inv%*%y
system.time(wls.cov.matrix%*%t(X)%*%Q.inv%*%y)

Is there a trick similar to the crossprod call used for wls.cov.matrix that would speed up the whole mean calculation, or is there no need? Thanks!

Answer 1

In the answer to your last question you were taught crossprod. Use that function again:

n = 1e4
set.seed(42)
y = rnorm(n, 3, 0.4)
X = matrix(c(rnorm(n,1,2), sample(c(1,-1), n, replace = TRUE), rnorm(n,2,0.5)), nrow = n, ncol = 3)
Q = diag(rnorm(n, 1.5, 0.3))
wls.cov.matrix = crossprod(X / sqrt(diag(Q)))
Q.inv = diag(1/diag(Q))
wls.mean = wls.cov.matrix%*%t(X)%*%Q.inv%*%y
wls.mean2 <- wls.cov.matrix %*% crossprod(X, Q.inv) %*% y
all.equal(wls.mean, wls.mean2)
#[1] TRUE

library(microbenchmark)
microbenchmark(wls.cov.matrix %*% t(X) %*% Q.inv %*% y,
               wls.cov.matrix %*% crossprod(X, Q.inv) %*% y,
               times=5)

#Unit: milliseconds
#                                        expr       min        lq    median       uq       max neval
#     wls.cov.matrix %*% t(X) %*% Q.inv %*% y 1019.3955 1022.1679 1022.2766 1024.540 1025.9131     5
#wls.cov.matrix %*% crossprod(X, Q.inv) %*% y  314.0622  315.3588  315.3933  317.024  317.1142     5

More performance improvements might be possible with some matrix algebra tricks, but that is not my forte.
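One such algebra trick is simply regrouping the product. R evaluates a chain of %*% from left to right, so the original expression multiplies a 3 x n matrix by the n x n Q.inv before y is ever involved. Grouping from the right keeps every intermediate result a vector. A minimal sketch, assuming the same setup as above but with a smaller n so it runs quickly; the names wls.mean.left and wls.mean.right are made up for illustration:

```r
set.seed(42)
n <- 500  # smaller than in the question, only to keep the demo fast
y <- rnorm(n, 3, 0.4)
X <- matrix(c(rnorm(n, 1, 2), sample(c(1, -1), n, replace = TRUE), rnorm(n, 2, 0.5)),
            nrow = n, ncol = 3)
Q <- diag(rnorm(n, 1.5, 0.3))
Q.inv <- diag(1 / diag(Q))
wls.cov.matrix <- crossprod(X / sqrt(diag(Q)))

# Default left-to-right grouping: the second step multiplies a 3 x n matrix
# by the n x n matrix Q.inv, which is the expensive part.
wls.mean.left <- wls.cov.matrix %*% t(X) %*% Q.inv %*% y

# Right-to-left grouping: Q.inv %*% y is n x 1 and crossprod(X, .) is 3 x 1,
# so only matrix-vector products are performed.
wls.mean.right <- wls.cov.matrix %*% crossprod(X, Q.inv %*% y)

all.equal(wls.mean.left, wls.mean.right)
```

This still builds the n x n matrix Q.inv, so it only removes the costly matrix-matrix product, not the memory cost.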

Answer 2

The main performance gain comes from not having any n-by-n matrices along the way. That is, do not build the Q matrix at all; work only with its diagonal.
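The reason the diagonal is enough: multiplying by a diagonal matrix only rescales rows, and R's vector recycling does that rescaling directly. A small sketch on toy data (the *.demo names and sizes are illustrative, not from the question):

```r
set.seed(1)
n.demo <- 6
X.demo <- matrix(rnorm(n.demo * 3), nrow = n.demo, ncol = 3)
y.demo <- rnorm(n.demo)
Qdiag.demo <- runif(n.demo, 1, 2)   # strictly positive diagonal entries
Q.inv.demo <- diag(1 / Qdiag.demo)  # the n x n matrix we would like to avoid

# X / v divides row i of X by v[i], because a length-n vector is recycled
# down the columns of a matrix with n rows.
scaled.rows <- X.demo / Qdiag.demo
via.matrix  <- Q.inv.demo %*% X.demo
all.equal(scaled.rows, via.matrix)

# Hence t(X) %*% Q.inv %*% y can be computed as crossprod(X / Qdiag, y).
lhs <- t(X.demo) %*% Q.inv.demo %*% y.demo
rhs <- crossprod(X.demo / Qdiag.demo, y.demo)
all.equal(lhs, rhs)
```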

Building on the answer by @Roland:

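# Qdiag replaces the Q-generating line in the setup above. rnorm() draws fresh
# values here, so the quantities that depend on Q are rebuilt before comparing.
Qdiag = rnorm(n, 1.5, 0.3)
Q = diag(Qdiag)
Q.inv = diag(1/Qdiag)   # only needed for the two slower variants
wls.cov.matrix = crossprod(X / sqrt(Qdiag))
wls.mean = wls.cov.matrix %*% t(X) %*% Q.inv %*% y

wls.mean3 <- wls.cov.matrix %*% crossprod(X/Qdiag, y)
all.equal(wls.mean, wls.mean3)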
microbenchmark(wls.cov.matrix %*% t(X) %*% Q.inv %*% y,
               wls.cov.matrix %*% crossprod(X, Q.inv) %*% y,
               wls.cov.matrix %*% crossprod(X/Qdiag, y),
           times=5)
#                                         expr        min         lq     median         uq       max neval
#      wls.cov.matrix %*% t(X) %*% Q.inv %*% y 358050.195 363713.250 368820.818 372414.747 374824.56     5
# wls.cov.matrix %*% crossprod(X, Q.inv) %*% y  79449.856  81411.195  84616.706  85351.968  88108.62     5
#     wls.cov.matrix %*% crossprod(X/Qdiag, y)    279.092    284.867    285.252    291.796    295.26     5
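
For reference, the textbook WLS estimate applies the inverse of the weighted cross-product: beta = (X' Q^-1 X)^-1 X' Q^-1 y. The sketch below assumes that this is the quantity ultimately wanted and that the diagonal of Q holds the error variances; both are assumptions, and the snippets above use wls.cov.matrix exactly as posted. It works from the diagonal only and cross-checks against lm.wfit from the stats package. The names XtWX, XtWy, beta.hat and beta.ref are illustrative:

```r
set.seed(42)
n <- 1e4
y <- rnorm(n, 3, 0.4)
X <- matrix(c(rnorm(n, 1, 2), sample(c(1, -1), n, replace = TRUE), rnorm(n, 2, 0.5)),
            nrow = n, ncol = 3)
Qdiag <- rnorm(n, 1.5, 0.3)  # diagonal of Q; the n x n matrix is never built

XtWX <- crossprod(X / sqrt(Qdiag))  # X' Q^-1 X, a 3 x 3 matrix
XtWy <- crossprod(X / Qdiag, y)     # X' Q^-1 y, a 3 x 1 matrix

# Solve the 3 x 3 linear system instead of forming an explicit inverse.
beta.hat <- solve(XtWX, XtWy)

# Cross-check: lm.wfit() fits y on X (no intercept is added) with weights 1 / Qdiag.
beta.ref <- lm.wfit(X, y, w = 1 / Qdiag)$coefficients
all.equal(as.vector(beta.hat), unname(beta.ref))
```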

2 answers

Answer 1
2 2014-06-13 15:39:02

Answer 2
2 Accepted 2014-06-13 16:22:15
