
Fast adjusted r-squared extraction

.lm.fit is considerably faster than lm, for reasons documented in several places, but getting an adjusted r-squared value from it is not as straightforward, so I'm hoping for some help.

Using lm() and then summary() to get the adjusted r-squared:

tstlm <- lm(cyl ~ hp + wt, data = mtcars)

summary(tstlm)$adj.r.squared

Using .lm.fit:

mtmatrix <- as.matrix(mtcars)

tstlmf <- .lm.fit(cbind(1, mtmatrix[, c("hp", "wt")]), mtmatrix[, "cyl"])

And here I'm stuck. I suspect the information I need to calculate the adjusted r-squared is somewhere in the object returned by .lm.fit, but I can't quite figure out how to proceed.

Thanks in advance for any suggestions.

1) R squared equals the squared correlation between the dependent variable and the fitted values. We can get the residuals from tstlmf using resid(tstlmf), and the fitted values equal y minus those residuals.

Adjusted R squared is then obtained from R squared by a rescaling that uses only the number of rows and columns of X.
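Written out, and assuming a model with an intercept and an X of full column rank, the adjustment is the standard formula below. The symbols n = nrow(X) and p = ncol(X) (column of ones included) are notation introduced here, not part of the original answer. In the code, -diff(dim(X)) is simply n - p.

```latex
\[
R^2_{\text{adj}} = 1 - \left(1 - R^2\right)\frac{n - 1}{n - p}
\]
```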

Note that the formulas would change if there is no intercept.

X <- with(mtcars, cbind(1, hp, wt))
y <- mtcars$cyl

tstlmf <- .lm.fit(X, y)

rsq <- cor(y, y - resid(tstlmf))^2; rsq
## [1] 0.7898

adj <- 1 - (1-rsq) * (nrow(X) - 1) / -diff(dim(X)); adj
## [1] 0.7753


# check
tstlm <- lm(cyl ~ hp + wt, mtcars)
s <- summary(tstlm)
s$r.squared
## [1] 0.7898
s$adj.r.squared
## [1] 0.7753

2) R squared can alternatively be calculated as the ratio var(fitted) / var(y), in which case we write:

tstlmf <- .lm.fit(X, y)

rsq <- var(y - resid(tstlmf)) / var(y); rsq
## [1] 0.7898

adj <- 1 - (1-rsq) * (nrow(X) - 1) / -diff(dim(X)); adj
## [1] 0.7753
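As a sketch of how the formulas change without an intercept: as far as I can tell, summary.lm then uses the uncentered sum of squares of the fitted values and n rather than n - 1 in the numerator of the adjustment. The helper name adj_r2_fit and its intercept argument are hypothetical, introduced only for this illustration.

```r
# Sketch: adjusted R squared from a .lm.fit result, with or without an intercept.
# `adj_r2_fit` and `intercept` are hypothetical names, not part of any package.
adj_r2_fit <- function(fit, y, intercept = TRUE) {
  res    <- fit$residuals
  fitted <- y - res
  n      <- length(y)
  rdf    <- n - fit$rank                   # residual degrees of freedom
  # model sum of squares: centered with an intercept, uncentered without
  mss <- if (intercept) sum((fitted - mean(fitted))^2) else sum(fitted^2)
  rss <- sum(res^2)
  rsq <- mss / (mss + rss)
  df_int <- if (intercept) 1 else 0
  1 - (1 - rsq) * (n - df_int) / rdf
}

y    <- mtcars$cyl
X0   <- with(mtcars, cbind(hp, wt))        # no column of ones
fit0 <- .lm.fit(X0, y)
adj0 <- adj_r2_fit(fit0, y, intercept = FALSE)

# compare with lm() fitted without an intercept
ref0 <- summary(lm(cyl ~ 0 + hp + wt, data = mtcars))$adj.r.squared
all.equal(adj0, ref0)
```

The caller has to say whether X contains a column of ones, because the .lm.fit result itself does not record that.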

collapse

flm in the collapse package may be slightly faster than .lm.fit. It returns the coefficients only.

library(collapse)

tstflm <- flm(y, X)
rsq <- c(cor(y, X %*% tstflm)^2); rsq
## [1] 0.7898
adj <- 1 - (1-rsq) * (nrow(X) - 1) / -diff(dim(X)); adj
## [1] 0.7753

or

tstflm <- flm(y, X)

rsq <- var(X %*% tstflm) / var(y); rsq
## [1] 0.7898
adj <- 1 - (1-rsq) * (nrow(X) - 1) / -diff(dim(X)); adj
## [1] 0.7753

The following function computes the adjusted R2 from an object returned by .lm.fit and the response vector y.

adj_r2_lmfit <- function(object, y){
  ypred <- y - resid(object)
  mss <- sum((ypred - mean(ypred))^2)
  rss <- sum(resid(object)^2)
  rdf <- length(resid(object)) - object$rank
  r.squared <- mss/(mss + rss)
  adj.r.squared <- 1 - (1 - r.squared)*(NROW(y) - 1)/rdf
  adj.r.squared
}

tstlm <- lm(cyl ~ hp + wt, data = mtcars)
tstlmf <- .lm.fit(cbind(1, mtmatrix[, c("hp", "wt")]), mtmatrix[, "cyl"])

summary(tstlm)$adj.r.squared
#[1] 0.7753073
adj_r2_lmfit(tstlmf, mtmatrix[, "cyl"])
#[1] 0.7753073
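To check whether the speed advantage mentioned in the question survives the extra arithmetic, a rough timing sketch with base R's system.time might look like this. The wrapper names and the repetition count are arbitrary choices made for the illustration, and the timings will vary by machine, so no numbers are shown.

```r
# Rough timing sketch using base R only; absolute numbers depend on the machine.
X <- with(mtcars, cbind(1, hp, wt))
y <- mtcars$cyl

# hypothetical wrappers around the two approaches
adj_via_lm <- function() {
  summary(lm(cyl ~ hp + wt, data = mtcars))$adj.r.squared
}
adj_via_lmfit <- function() {
  fit <- .lm.fit(X, y)
  rsq <- cor(y, y - fit$residuals)^2       # valid because X has an intercept column
  1 - (1 - rsq) * (nrow(X) - 1) / (nrow(X) - fit$rank)
}

n_rep <- 1000                              # arbitrary repetition count
t_lm    <- system.time(for (i in seq_len(n_rep)) adj_via_lm())
t_lmfit <- system.time(for (i in seq_len(n_rep)) adj_via_lmfit())

# elapsed seconds for each approach
c(lm = t_lm[["elapsed"]], lmfit = t_lmfit[["elapsed"]])
```

For more careful measurements a dedicated benchmarking package would be a better fit than system.time, but this is enough to see the order of magnitude.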
