Extracting Coefficients, Std Errors, R2 etc from multiple regressions

Question

I have the following regression code:

models <- lapply(1:25, function(x) lm(Y_df[,x] ~ X1))

This runs 25 regressions, one for each of the 25 columns in the Y_df data frame.

One of the outputs looks like this:

models[15] # Gives me the coefficients for model 15

Call:
lm(formula = Y_df[, x] ~ X1)

Coefficients:
(Intercept)         X1 
  0.1296812    1.0585835

I can store these coefficients in a separate data frame. The problem I am running into concerns the standard errors, R2, residuals, etc.

I would like to store these in a separate data frame as well.

I can run individual regressions and extract the summaries, which look like normal R regression output.

ls_1 <- summary(models[[1]])
ls_1
ls_1$sigma

However, I am hoping to take the values directly from the list created by the line of code that runs the 25 regressions.

This code works:

> (models[[15]]$coefficients)
  (Intercept)          X1 
-0.3643446787  1.0789369642

However, this code does not:

> (models[[15]]$sigma)
NULL

I have tried a variety of different combinations to extract these results, with no luck.

The following did exactly what I wanted for the coefficients. I had hoped there was a way to replace coef with something like Std Error or R2, but this does not work.

models <- lapply(1:25, function(x) lm(Y_df[,x] ~ X1))
# extract just coefficients
coefficients <- sapply(models, coef)

Ideally I would like to store the Std. Error from the above models in the same way.

Answer 1

If a model is named mod, you can get to all of the residuals in the same way as the coefficients:

mod$residuals

There are also functions that extract the coefficients and residuals:

coef(mod)
resid(mod)

You can extract the other outputs via summary():

summary(mod)$coefficients[,"Std. Error"]  # standard errors
summary(mod)$r.squared                    # r squared
summary(mod)$adj.r.squared                # adjusted r squared
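The residual standard error from the question (sigma) lives on the summary object too, not on the lm fit itself, which is why models[[15]]$sigma returns NULL. A minimal sketch, assuming the built-in mtcars data as a stand-in for Y_df and X1:

```r
# mtcars is only a stand-in for the asker's data (an assumption for illustration)
mod <- lm(mpg ~ wt, data = mtcars)
s <- summary(mod)   # summary.lm object holding the extra statistics

is.null(mod$sigma)               # the lm object has no sigma component
s$sigma                          # residual standard error
s$coefficients[, "Std. Error"]   # one standard error per coefficient
s$r.squared                      # R squared
```

Storing the summary once in s avoids recomputing it for every statistic.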

So you can either create a list containing each of these results for each model:

outputList <- lapply(models, function(mod){
  s <- summary(mod)  # compute the summary once per model
  coefs <- coef(mod)
  stdErr <- s$coefficients[,"Std. Error"]
  rsq <- s$r.squared
  rsq_adj <- s$adj.r.squared
  rsd <- resid(mod)
  list(coefs = coefs, 
       stdErr = stdErr, 
       rsq = rsq, 
       rsq_adj = rsq_adj, 
       rsd = rsd)
})

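You can then get the rsq for the first model via outputList[[1]]$rsq, for example. Because models is an unnamed list, the results are indexed by position; if you give the list names first, you can look results up by name instead.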
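A minimal sketch of looking results up by name rather than position. The simulated X1 and Y_df (with columns a, b, c) are assumptions standing in for the asker's data:

```r
# Simulated stand-ins for the asker's X1 and Y_df (assumptions for illustration)
set.seed(1)
X1 <- rnorm(50)
Y_df <- data.frame(a = X1 + rnorm(50),
                   b = 2 * X1 + rnorm(50),
                   c = rnorm(50))

models <- lapply(seq_along(Y_df), function(i) lm(Y_df[, i] ~ X1))
names(models) <- names(Y_df)   # name each model after its response column

# lapply() keeps the names, so the output list is named as well
outputList <- lapply(models, function(mod) {
  s <- summary(mod)
  list(coefs  = coef(mod),
       stdErr = s$coefficients[, "Std. Error"],
       rsq    = s$r.squared)
})

outputList$b$rsq      # look up by name
outputList[[2]]$rsq   # same value, looked up by position
```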

Or you can create separate dataframes for each:

library(tidyverse)

# coefficients
coefs <- lapply(models, coef) %>%
  do.call(rbind, .) %>%
  as.data.frame() %>% # convert from matrix to dataframe
  rownames_to_column("model") # add the model index (or name, if models is a named list) as a column

# standard errors
stdErr <- lapply(models, function(mod){
  summary(mod)$coefficients[,"Std. Error"]
}) %>%
  do.call(rbind, .) %>%
  as.data.frame() %>% 
  rownames_to_column("model") 

# r squareds
rsq <- sapply(models, function(mod){
  summary(mod)$r.squared
}) %>%
  as.data.frame() %>% 
  rownames_to_column("model")

# adjusted r squareds
rsq_adj <- sapply(models, function(mod){
  summary(mod)$adj.r.squared
}) %>%
  as.data.frame() %>% 
  rownames_to_column("model")

# residuals
rsd <- lapply(models, resid) %>%
  do.call(rbind, .) %>%
  as.data.frame() %>% 
  rownames_to_column("model")
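If a single table is more convenient than separate data frames, the same statistics can be collected with one row per model. This is a sketch in base R only; the simulated X1 and Y_df and the column names of results are assumptions for illustration:

```r
# Simulated stand-ins for the asker's X1 and Y_df (assumptions for illustration)
set.seed(42)
X1 <- rnorm(40)
Y_df <- data.frame(y1 = 1 + 2 * X1 + rnorm(40),
                   y2 = -1 + 0.5 * X1 + rnorm(40))

# lapply() over a data frame iterates its columns and keeps their names
models <- lapply(Y_df, function(y) lm(y ~ X1))

# Build one row per model, then stack the rows
results <- do.call(rbind, lapply(names(models), function(nm) {
  s  <- summary(models[[nm]])
  cf <- s$coefficients   # matrix: Estimate, Std. Error, t value, Pr(>|t|)
  data.frame(
    model         = nm,
    intercept     = cf["(Intercept)", "Estimate"],
    slope         = cf["X1", "Estimate"],
    se_intercept  = cf["(Intercept)", "Std. Error"],
    se_slope      = cf["X1", "Std. Error"],
    sigma         = s$sigma,
    r_squared     = s$r.squared,
    adj_r_squared = s$adj.r.squared
  )
}))

results
```

Residuals are left out here because they have one value per observation, not one per model.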

It is worth noting that, if you're in RStudio and you assign the summary to an object (e.g. temp <- summary(mod)), you can type the name of the object followed by "$", and a dropdown appears listing all the components that can be extracted from the summary.
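Outside RStudio, the same list of components can be inspected with names(). A small sketch, again assuming the built-in mtcars data as a stand-in:

```r
# mtcars is only a stand-in for the asker's data (an assumption for illustration)
mod <- lm(mpg ~ wt, data = mtcars)
temp <- summary(mod)

names(temp)   # components of the summary, e.g. sigma and r.squared
names(mod)    # components of the fit itself, e.g. coefficients and residuals
```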
