I have written R code that fits the same multiple regression model (15 variables) more than 10 million times. For each model I need to extract the t-statistic of only one independent variable. Each model also needs a heteroskedasticity-consistent covariance matrix estimator, and I am using White's estimator via coeftest and vcovHC, but I noticed that this step greatly increases the run time of the simulation. Is there a way to speed up the code, given that I only need the t-statistic of the second variable?
Below is a toy example of what I am doing at each iteration:
model <- lm(y ~ a + b + c, data = data)
model <- coeftest(model, vcov. = vcovHC(model, type = "HC"))
t[i] <- model[2, 3]  # coeftest() returns a matrix: row 2 = coefficient of a, column 3 = t value
The variables involved are always the same, but I am permuting their values randomly. In other words, I am permuting the model matrix X.
It is possible to cut the computation time by roughly 30% by computing the heteroscedasticity-corrected covariance matrix with the hccm function from the car package and then extracting the t-values directly. See the benchmark below:
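library(lmtest)          # coeftest()
library(sandwich)        # vcovHC()
library(microbenchmark)  # microbenchmark()
library(ggplot2)         # provides the mpg data set
library(car)             # hccm()

microbenchmark(
  hccm = {
    m <- lm(cty ~ displ + cyl, data = mpg)
    V <- hccm(m, type = "hc0")  # HC0 (White) covariance matrix
    cfs <- m$coefficients
    ses <- sqrt(diag(V))        # robust standard errors
    cfs / ses                   # t-values for all coefficients
  },
  coeftest = {
    m <- lm(cty ~ displ + cyl, data = mpg)
    coeftest(m, vcov. = vcovHC(m, type = "HC0"))
  }
)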
Output:
Unit: milliseconds
     expr      min       lq     mean   median       uq      max neval cld
     hccm 1.695146 1.777919 1.939631 1.822293 1.891840 10.65045   100   a
 coeftest 2.557013 2.650025 2.735701 2.684586 2.764373  3.37536   100   b
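Since only one t-value is needed, it may also be possible to skip lm() and the full covariance matrix and compute just the relevant diagonal entry of the HC0 sandwich from the model matrix. The sketch below is only an illustration under stated assumptions: hc0_t is a hypothetical helper (it is not part of car or sandwich), X is assumed to have full column rank and to include the intercept column, and the built-in mtcars data stands in for the real data. It has not been benchmarked here, so any speed gain should be measured on the actual problem.

```r
# Hypothetical helper (not from car or sandwich): HC0 (White) t-statistic
# for the j-th coefficient, computed directly from the model matrix.
hc0_t <- function(X, y, j = 2) {
  XtX_inv <- solve(crossprod(X))          # (X'X)^{-1}; assumes full column rank
  beta    <- XtX_inv %*% crossprod(X, y)  # OLS coefficients
  e       <- as.vector(y - X %*% beta)    # OLS residuals
  a       <- as.vector(X %*% XtX_inv[, j])  # j-th column of X (X'X)^{-1}
  # j-th diagonal entry of (X'X)^{-1} X' diag(e^2) X (X'X)^{-1}
  se_j    <- sqrt(sum((a * e)^2))
  beta[j] / se_j
}

# Example with the built-in mtcars data (intercept in column 1)
X <- cbind(1, mtcars$disp, mtcars$cyl)
y <- mtcars$mpg
t_direct <- hc0_t(X, y, j = 2)  # robust t-value for disp
```

With sandwich and lmtest installed, the result can be checked against coeftest(m, vcov. = vcovHC(m, type = "HC0"))[2, 3] for m <- lm(mpg ~ disp + cyl, data = mtcars); the two should agree up to floating-point error. Note that solve(crossprod(X)) is less numerically robust than the QR decomposition lm() uses, so this shortcut is best suited to well-conditioned model matrices.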