.lm.fit is considerably faster than lm for reasons documented in several places, but it is not as straightforward to get an adjusted R-squared value from it, so I'm hoping for some help.
Using lm() and then summary() to get the adjusted r-squared.
tstlm <- lm(cyl ~ hp + wt, data = mtcars)
summary(tstlm)$adj.r.squared
Using .lm.fit:
mtmatrix <- as.matrix(mtcars)
tstlmf <- .lm.fit(cbind(1, mtmatrix[, c("hp", "wt")]), mtmatrix[, "cyl"])
And here I'm stuck. I suspect the information I need to calculate adjusted R-squared is found in the .lm.fit model somewhere, but I can't quite figure out how to proceed.
Thanks in advance for any suggestions.
1) R squared equals the squared correlation between the dependent variable and the fitted values. We can get the residuals from tstlmf using resid(tstlmf), and the fitted values equal y minus those residuals.
Adjusted R squared is formed from R squared by rescaling 1 - R squared with a factor that depends only on the number of rows and columns of X.
Note that the formulas would change if there is no intercept.
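In symbols, the adjustment used in the code can be written as follows. This is a sketch that assumes X has full column rank and includes the intercept column; n and p are notation introduced here for nrow(X) and ncol(X), so that -diff(dim(X)) equals n - p.

```latex
R^2_{\mathrm{adj}} = 1 - \left(1 - R^2\right)\frac{n - 1}{n - p}
```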
X <- with(mtcars, cbind(1, hp, wt))
y <- mtcars$cyl
tstlmf <- .lm.fit(X, y)
rsq <- cor(y, y - resid(tstlmf))^2; rsq
## [1] 0.7898
adj <- 1 - (1-rsq) * (nrow(X) - 1) / -diff(dim(X)); adj
## [1] 0.7753
# check
tstlm <- lm(cyl ~ hp + wt, mtcars)
s <- summary(tstlm)
s$r.squared
## [1] 0.7898
s$adj.r.squared
## [1] 0.7753
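To illustrate the remark that the formulas change without an intercept, here is a minimal sketch of the no-intercept case. It assumes the conventions used by summary.lm for such models: uncentered sums of squares for R squared, and n in place of n - 1 in the adjustment. The names X0, fit0, rsq0 and adj0 are introduced here only for illustration.

```r
# No-intercept fit: X0 has no column of ones
X0 <- with(mtcars, cbind(hp, wt))
y <- mtcars$cyl
fit0 <- .lm.fit(X0, y)

res0 <- fit0$residuals
fitted0 <- y - res0

# Uncentered sums of squares, since there is no intercept
mss0 <- sum(fitted0^2)
rss0 <- sum(res0^2)
rsq0 <- mss0 / (mss0 + rss0)

# n replaces n - 1 in the adjustment when there is no intercept
n <- nrow(X0)
p <- ncol(X0)
adj0 <- 1 - (1 - rsq0) * n / (n - p)

# check against lm() fitted without an intercept
s0 <- summary(lm(cyl ~ hp + wt - 1, data = mtcars))
all.equal(rsq0, s0$r.squared)
all.equal(adj0, s0$adj.r.squared)
```

Both all.equal() calls should return TRUE if the assumptions above hold.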
2) R squared can alternately be calculated as the ratio var(fitted) / var(y) as in the link above and in that case we write:
tstlmf <- .lm.fit(X, y)
rsq <- var(y - resid(tstlmf)) / var(y); rsq
## [1] 0.7898
adj <- 1 - (1-rsq) * (nrow(X) - 1) / -diff(dim(X)); adj
## [1] 0.7753
flm in the collapse package may be slightly faster than .lm.fit. It returns the coefficients only.
library(collapse)
tstflm <- flm(y, X)
rsq <- c(cor(y, X %*% tstflm)^2); rsq
## [1] 0.7898
adj <- 1 - (1-rsq) * (nrow(X) - 1) / -diff(dim(X)); adj
## [1] 0.7753
or
tstflm <- flm(y, X)
rsq <- var(X %*% tstflm) / var(y); rsq
## [1] 0.7898
adj <- 1 - (1-rsq) * (nrow(X) - 1) / -diff(dim(X)); adj
## [1] 0.7753
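Because a coefficients-only fit leaves the fitted values to be rebuilt from X, the calculation can be wrapped in a small helper. This is only a sketch: adj_r2_from_coef is a hypothetical name, it assumes X contains an intercept column and has full column rank, and the coefficients below come from .lm.fit as a base R stand-in for flm(y, X).

```r
# Hypothetical helper: adjusted R squared from a design matrix X (with an
# intercept column), a response y, and a coefficient vector beta
adj_r2_from_coef <- function(X, y, beta) {
  fitted <- drop(X %*% beta)      # fitted values as a plain vector
  rsq <- var(fitted) / var(y)     # valid only when X has an intercept
  n <- nrow(X)
  p <- ncol(X)
  1 - (1 - rsq) * (n - 1) / (n - p)
}

X <- with(mtcars, cbind(1, hp, wt))
y <- mtcars$cyl
beta <- .lm.fit(X, y)$coefficients   # stand-in for flm(y, X)
adj_from_coef <- adj_r2_from_coef(X, y, beta)
adj_from_coef
```

The same helper should accept the coefficient vector or one-column matrix returned by flm, since only X %*% beta is used.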
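The following function computes the adjusted R2 from an object returned by .lm.fit and the response vector y. It assumes the model includes an intercept.
adj_r2_lmfit <- function(object, y){
  # fitted values are the response minus the residuals
  ypred <- y - resid(object)
  # model and residual sums of squares
  mss <- sum((ypred - mean(ypred))^2)
  rss <- sum(resid(object)^2)
  # residual degrees of freedom: observations minus rank of X
  rdf <- length(resid(object)) - object$rank
  r.squared <- mss/(mss + rss)
  adj.r.squared <- 1 - (1 - r.squared)*(NROW(y) - 1)/rdf
  adj.r.squared
}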
tstlm <- lm(cyl ~ hp + wt, data = mtcars)
tstlmf <- .lm.fit(cbind(1, mtmatrix[, c("hp", "wt")]), mtmatrix[, "cyl"])
summary(tstlm)$adj.r.squared
#[1] 0.7753073
adj_r2_lmfit(tstlmf, mtmatrix[, "cyl"])
#[1] 0.7753073