简体   繁体   English

如何用.lm.fit预测和提取R平方?

[英]How to predict and extract R Squared with .lm.fit?

As the title suggest, I have seen some user mentioned that .lm.fit() functions has an advantage of more speed than a regular lm() , but when i look deeper at .lm.fit() in help, it is supposed to be a fitter functions, it returns a set of list instead of a model , which makes me to think is it still possible to extract components like R squared, Adj R Squared, and lastly do a predict() out of it?正如标题所暗示的那样,我看到一些用户提到.lm.fit()函数具有比常规lm()更快的速度优势,但是当我深入了解.lm.fit()的帮助时,它应该是作为一个更合适的函数,它返回一组列表而不是model ,这让我认为是否仍然可以提取 R 平方、Adj R 平方等组件,最后从中进行predict()

Below is sample data and executions:以下是示例数据和执行:

test_dat <- data.frame(y = rnorm(780, 20, 10))
for(b in 1:300){
  name_var <- paste0("x",b)
  test_dat[[name_var]] <- rnorm(780, 0.01 * b, 5)
}

tic()
obj_lm <- lm(y ~ ., data = test_dat)
print(class(obj_lm))
print(summary(obj_lm)$r.squared)
print(summary(obj_lm)$adj.r.squared)
predict(obj_lm)
toc() #approximately 0.4 seconds

tic()
datm <- as.matrix(test_dat)
obj_lm_fit <- .lm.fit(cbind(1,datm[,-1]), datm[,1])
print(class(obj_lm_fit))
toc() #approximately 0.2 seconds

Functions predict and resid are generic and since .lm.fit returns an object of class "list" , all you have to do is to write methods implementing the definitions of what you want.函数predictresid是通用的,因为.lm.fit返回 object of class "list" ,你所要做的就是编写实现你想要的定义的方法。 Below are methods to compute fitted values, residuals and R^2.以下是计算拟合值、残差和 R^2 的方法。

set.seed(2023)    # make the results reproducible
test_dat <- data.frame(y = rnorm(780, 20, 10))
for(b in 1:300){
  name_var <- paste0("x",b)
  test_dat[[name_var]] <- rnorm(780, 0.01 * b, 5)
}

obj_lm <- lm(y ~ ., data = test_dat)

datm <- as.matrix(test_dat)
obj_lm_fit <- .lm.fit(cbind(1,datm[,-1]), datm[,1])

#------------------------------------------------------------------------
# the methods for objects of class "list"
#
fitted.list <- function(object, X) {
  X %*% coef(object)
}
resid.list <- residuals.list <- function(object, X, y) {
  y_fitted <- fitted(object, X)
  y - y_fitted
}
rsquared <- function(x, ...) UseMethod("rsquared")
rsquared.default <- function(x, ...) {
  summary(x)$r.squared
}
rsquared.list <- function(object, X, y) {
  e <- resid.list(object, X, y)
  1 - sum(e^2)/sum( (y - mean(y))^2 )
}

rsquared(obj_lm_fit, cbind(1,datm[,-1]), datm[,1])
#> [1] 0.3948863
rsquared(obj_lm)
#> [1] 0.3948863

Created on 2023-01-03 with reprex v2.0.2创建于 2023-01-03,使用reprex v2.0.2


Edit 1编辑 1

Added method to also calculate adj.R2添加了计算 adj.R2 的方法

adj_rsquared_list <- function(object, X, y){
  r2 <- rsquared.list(object, X, y)
  k <- ncol(X) - 1
  n <- nrow(X)  
  rate_of_error <- (1 - r2) * (n - 1) / (n - k - 1)
  adj_r2 <- 1 - rate_of_error
  return(adj_r2)
}

adj_rsquared_list(obj_lm_fit, cbind(1,datm[,-1]), datm[,1])
#> [1] 0.01590061

Edit 2编辑 2

After the edit by Jovan , I have changed fitted.list above to use coef() , a function that extracts the first arguments list member "coefficients" , if it exists, and rewrote the default and list methods of rsquared to accept a adj argument.Jovan编辑之后,我将上面的fitted.list更改为使用coef() ,一个 function 提取第一个 arguments 列表成员"coefficients" (如果存在),并重写rsquared的默认和列表方法以接受adj参数. The code to compute the adjusted R^2 is a copy&paste of Jovan's code.计算调整后的 R^2 的代码是 Jovan 代码的复制和粘贴。

rsquared <- function(x, ...) UseMethod("rsquared")
rsquared.default <- function(x, adj = FALSE, ...) {
  if(adj) {
    summary(x)$adj.r.squared
  } else summary(x)$r.squared
}
rsquared.list <- function(object, X, y, adj = FALSE) {
  e <- resid.list(object, X, y)
  r2 <- 1 - sum(e^2)/sum( (y - mean(y))^2 )
  if(adj) {
    k <- ncol(X) - 1
    n <- nrow(X)  
    rate_of_error <- (1 - r2) * (n - 1) / (n - k - 1)
    adj_r2 <- 1 - rate_of_error
    adj_r2
  } else r2
}

# same as above
rsquared(obj_lm_fit, cbind(1,datm[,-1]), datm[,1])
#> [1] 0.3948863
rsquared(obj_lm)
#> [1] 0.3948863

# new, `adj = TRUE`
rsquared(obj_lm_fit, cbind(1,datm[,-1]), datm[,1], adj = TRUE)
#> [1] 0.01590061
rsquared(obj_lm, adj = TRUE)
#> [1] 0.01590061

Created on 2023-01-03 with reprex v2.0.2创建于 2023-01-03,使用reprex v2.0.2

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM