Speed-up inverse calculation of weighted least squares mean estimate in R

I need to speed up the calculation of the mean estimate of beta in a WLS in R. I was able to speed up the covariance calculation thanks to SO, and now I am wondering whether there is another trick to speed up the mean calculation as well (or whether what I am doing is already efficient enough).

n = 10000
y = rnorm(n, 3, 0.4)
X = matrix(c(rnorm(n,1,2), sample(c(1,-1), n, replace = TRUE), rnorm(n,2,0.5)), nrow = n, ncol = 3)
Q = diag(rnorm(n, 1.5, 0.3))
wls.cov.matrix = crossprod(X / sqrt(diag(Q)))
Q.inv = diag(1/diag(Q))
wls.mean = wls.cov.matrix%*%t(X)%*%Q.inv%*%y
system.time(wls.cov.matrix%*%t(X)%*%Q.inv%*%y)

Is there a trick similar to the crossprod one used for wls.cov.matrix that speeds up the whole mean calculation, or is there no need? Thanks!

In the answer to your last question you were taught crossprod. Use that function again:

n = 1e4
set.seed(42)
y = rnorm(n, 3, 0.4)
X = matrix(c(rnorm(n,1,2), sample(c(1,-1), n, replace = TRUE), rnorm(n,2,0.5)), nrow = n, ncol = 3)
Q = diag(rnorm(n, 1.5, 0.3))
wls.cov.matrix = crossprod(X / sqrt(diag(Q)))
Q.inv = diag(1/diag(Q))
wls.mean = wls.cov.matrix%*%t(X)%*%Q.inv%*%y
wls.mean2 <- wls.cov.matrix %*% crossprod(X, Q.inv) %*% y
all.equal(wls.mean, wls.mean2)
#[1] TRUE

library(microbenchmark)
microbenchmark(wls.cov.matrix %*% t(X) %*% Q.inv %*% y,
               wls.cov.matrix %*% crossprod(X, Q.inv) %*% y,
               times=5)

#Unit: milliseconds
#                                        expr       min        lq    median       uq       max neval
#     wls.cov.matrix %*% t(X) %*% Q.inv %*% y 1019.3955 1022.1679 1022.2766 1024.540 1025.9131     5
#wls.cov.matrix %*% crossprod(X, Q.inv) %*% y  314.0622  315.3588  315.3933  317.024  317.1142     5

More performance improvements might be possible with some matrix algebra tricks, but that is not my forte.
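One such algebra trick, offered here as a sketch rather than as part of the answer above, is to regroup the product. R evaluates a chain of `%*%` from left to right, so the original expression forms a 3-by-n times n-by-n product. Grouping from the right keeps every intermediate result a vector, although the dense `Q.inv %*% y` step still touches all n^2 entries. The data below are made up for illustration (`runif` is used for the diagonal only to guarantee positive values), and no timings are claimed.

```r
set.seed(1)
n <- 2000
X <- matrix(rnorm(n * 3), nrow = n, ncol = 3)
y <- rnorm(n)
q <- runif(n, 1, 2)                  # assumed diagonal of Q (illustrative values)
Q.inv <- diag(1 / q)                 # dense n-by-n matrix, as in the question
wls.cov.matrix <- crossprod(X / sqrt(q))

# Left to right: the second product is (3 x n) %*% (n x n)
left_to_right <- wls.cov.matrix %*% t(X) %*% Q.inv %*% y

# Right to left: every intermediate result is a vector
right_to_left <- wls.cov.matrix %*% (t(X) %*% (Q.inv %*% y))

all.equal(left_to_right, right_to_left)
```

Both groupings give the same 3-by-1 result up to floating-point rounding; only the amount of intermediate work differs.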

The main performance gain comes from not creating any n-by-n matrices along the way. That means never building the Q matrix and working only with its diagonal.
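A small sketch of why the diagonal alone is enough, using made-up data and illustrative names (`q`, `dense`, `scaled`): dividing a matrix by a vector whose length equals the number of rows recycles the vector down each column, which scales row i by `1 / q[i]`. That is exactly what left-multiplying by the diagonal matrix does, without storing n^2 numbers.

```r
set.seed(1)
n <- 1000
X <- matrix(rnorm(n * 3), nrow = n, ncol = 3)
q <- runif(n, 1, 2)            # diagonal of Q kept as a plain vector (illustrative values)

dense  <- diag(1 / q) %*% X    # builds an n-by-n matrix first
scaled <- X / q                # recycles q down each column: row i is divided by q[i]

all.equal(dense, scaled)

# Storage: n^2 doubles for the matrix versus n doubles for the vector
object.size(diag(q))
object.size(q)
```

With n = 10000 as in the question, the dense matrix holds 10^8 doubles (about 800 MB), while the vector holds 10^4.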

Building on the answer by @Roland:

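# Work with the diagonal of Q as a plain vector. It is taken from the Q built above
# so the result is comparable with wls.mean; in a fresh script, draw Qdiag directly
# and never build the n-by-n matrix.
Qdiag = diag(Q)

wls.mean3 <- wls.cov.matrix %*% crossprod(X/Qdiag, y)
all.equal(wls.mean, wls.mean3)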
microbenchmark(wls.cov.matrix %*% t(X) %*% Q.inv %*% y,
               wls.cov.matrix %*% crossprod(X, Q.inv) %*% y,
               wls.cov.matrix %*% crossprod(X/Qdiag, y),
           times=5)
#      wls.cov.matrix %*% t(X) %*% Q.inv %*% y 358050.195 363713.250 368820.818 372414.747 374824.56     5
# wls.cov.matrix %*% crossprod(X, Q.inv) %*% y  79449.856  81411.195  84616.706  85351.968  88108.62     5
#     wls.cov.matrix %*% crossprod(X/Qdiag, y)    279.092    284.867    285.252    291.796    295.26     5
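The snippets in this thread multiply by the cross-product matrix `crossprod(X / sqrt(diag(Q)))` rather than solving against it. If the goal is the textbook WLS coefficient estimate, which solves (X' Q^-1 X) beta = X' Q^-1 y, the same "diagonal only" idea carries over. The following is a sketch under that assumption, with made-up data and illustrative names (`XtQinvX`, `XtQinvy`, `beta`); it cross-checks the result against `lm.wfit` from the stats package.

```r
set.seed(1)
n <- 1000
X <- matrix(c(rnorm(n, 1, 2), sample(c(1, -1), n, replace = TRUE), rnorm(n, 2, 0.5)),
            nrow = n, ncol = 3)
y <- rnorm(n, 3, 0.4)
Qdiag <- runif(n, 1, 2)              # assumed positive error variances (diagonal of Q)

# Both pieces use only the vector Qdiag; no n-by-n matrix is formed
XtQinvX <- crossprod(X / sqrt(Qdiag))    # X' Q^-1 X
XtQinvy <- crossprod(X / Qdiag, y)       # X' Q^-1 y
beta <- solve(XtQinvX, XtQinvy)          # solve the normal equations, no explicit inverse

# Cross-check: weighted least squares fit with weights 1 / Qdiag
fit <- lm.wfit(X, y, w = 1 / Qdiag)
all.equal(as.vector(beta), unname(fit$coefficients))
```

Calling `solve(A, b)` avoids forming the inverse explicitly, which is usually both cheaper and numerically safer than `solve(A) %*% b`.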
