Obtain t-statistic for regression coefficients of an “mlm” object returned by `lm()`

I've used lm() to fit multiple regression models, for multiple (~1 million) response variables in R. E.g.:

allModels <- lm(t(responseVariablesMatrix) ~ modelMatrix)

This returns an object of class "mlm", which is like one huge object containing all the models. I want to get the t-statistic for the first coefficient in each model, which I can do using summary(allModels), but it's very slow on data this large and returns a lot of unwanted information too.

Is there a way of calculating the t-statistic manually that might be faster than using the summary() function?

Thanks!

You can hack the summary.lm() function to get just the bits you need and leave the rest.

If you have

nVariables <- 5
nObs <- 15

y <- rnorm(nObs)
x <- matrix(rnorm(nVariables*nObs),nrow=nObs)

allModels <-lm(y~x)

Then this is the code from the summary.lm() function, but with all the excess baggage removed (note that all the error handling has been removed as well).

p <- allModels$rank                          # number of estimable coefficients
rdf <- allModels$df.residual                 # residual degrees of freedom
Qr <- allModels$qr                           # QR decomposition of the design matrix
p1 <- 1L:p
r <- allModels$residuals
rss <- sum(r^2)                              # residual sum of squares
resvar <- rss/rdf                            # estimated residual variance
R <- chol2inv(Qr$qr[p1, p1, drop = FALSE])   # unscaled covariance, (X'X)^-1
se <- sqrt(diag(R) * resvar)                 # standard errors of the coefficients
est <- allModels$coefficients[Qr$pivot[p1]]  # estimates, in pivoted order
tval <- est/se                               # t-statistics

tval is now a vector of the t-statistics, the same values as given by

summary(allModels)$coefficients[,3]
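
As a quick sanity check, the manual calculation can be compared against summary() on the same toy data. This is only a sketch: the seed is an assumption added for reproducibility, and the steps are condensed from the code above.

```r
# Recompute the t-statistics by hand and compare with summary().
set.seed(1)  # assumption: fixed seed so the example is reproducible
nVariables <- 5
nObs <- 15
y <- rnorm(nObs)
x <- matrix(rnorm(nVariables * nObs), nrow = nObs)
allModels <- lm(y ~ x)

p1 <- seq_len(allModels$rank)
resvar <- sum(allModels$residuals^2) / allModels$df.residual
R <- chol2inv(allModels$qr$qr[p1, p1, drop = FALSE])
est <- allModels$coefficients[allModels$qr$pivot[p1]]
tval <- est / sqrt(diag(R) * resvar)

# Names are dropped so that only the numeric values are compared.
all.equal(unname(tval), unname(summary(allModels)$coefficients[, 3]))
```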

If you have problems on the large model, you might want to rewrite the code so that it keeps fewer objects in memory, by combining several assignments into fewer lines.

It's a hacky solution, I know, but it should be about as fast as possible. I suppose it would be neater to put all the lines of code into a function as well.
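
One way to wrap this in a function, extended to the matrix-response ("mlm") case from the question, is sketched below. The name tstat_mlm is hypothetical, and the sketch assumes an unweighted fit with no missing values. It relies on the fact that all responses share the same design matrix, so (X'X)^-1 is computed once and only the residual variance differs per response.

```r
# Hypothetical helper: t-statistics for every coefficient of every response
# in an lm() fit (works for a single response or an "mlm" object).
# Assumes an unweighted fit with no NA residuals.
tstat_mlm <- function(fit) {
  p1 <- seq_len(fit$rank)
  Qr <- fit$qr
  # Diagonal of the unscaled covariance (X'X)^-1, shared by all responses
  xtx_inv_diag <- diag(chol2inv(Qr$qr[p1, p1, drop = FALSE]))
  # Residual variance: one value per response (column of the residuals)
  resvar <- colSums(as.matrix(fit$residuals)^2) / fit$df.residual
  # Standard errors: coefficients in rows, responses in columns
  se <- sqrt(outer(xtx_inv_diag, resvar))
  est <- as.matrix(fit$coefficients)[Qr$pivot[p1], , drop = FALSE]
  est / se
}

# Toy usage with 3 responses (one per column) and 5 predictors
set.seed(1)  # assumption: fixed seed for reproducibility
nObs <- 15
Y <- matrix(rnorm(nObs * 3), nrow = nObs)
x <- matrix(rnorm(nObs * 5), nrow = nObs)
fit <- lm(Y ~ x)
tvals <- tstat_mlm(fit)
tvals[1, ]  # t-statistic of the first coefficient, one value per response
```

With an intercept in the model, the first row corresponds to the intercept; index a different row to pick another coefficient.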
