
R Multiple Regression Loop and Extract Coefficients

I have to perform multiple linear regression for many vectors of dependent variables on the same matrix of independent variables.

For example, I want to create 3 models such that:

lm( d ~ a + b + c )
lm( e ~ a + b + c )
lm( f ~ a + b + c )

from the following matrix (a, b, c are the independent variables and d, e, f are the dependent variables):

       [,1]     [,2]     [,3]     [,4]     [,5]     [,6]
[1,]    a1       b1       c1       d1       e1       f1
[2,]    a2       b2       c2       d2       e2       f2
[3,]    a3       b3       c3       d3       e3       f3

I then want to store the coefficients from the regression in another matrix (I have reduced the number of columns and vectors in my example for ease of explanation).

Here's a method that is not very general, but it will work if you substitute your own dependent variable names in depvar, the independent variables common to all models in the inner lm() call, and the dataset name. Here it is demonstrated on mtcars, a built-in dataset supplied with R.

depvar <- c("mpg", "disp", "qsec")
regresults <- lapply(depvar, function(dv) {
    tmplm <- lm(get(dv) ~ cyl + hp + wt, data = mtcars)
    coef(tmplm)
})
# lapply() returns a list, where each element is a vector of coefficients
# do.call(rbind, regresults) stacks those vectors row-wise into a matrix
allresults <- data.frame(depvar = depvar, 
                         do.call(rbind, regresults))
# tidy up name of intercept variable
names(allresults)[2] <- "intercept"
allresults
##   depvar  intercept        cyl          hp        wt
## 1    mpg   38.75179 -0.9416168 -0.01803810 -3.166973
## 2   disp -179.04186 30.3212049  0.21555502 59.222023
## 3   qsec   19.76879 -0.5825700 -0.01881199  1.381334
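Since the question asks for a matrix of coefficients, the same loop can be sketched with sapply() so that the result is a matrix from the start. This is only a variation on the approach above, not a replacement for it: the names indvar and coefmat are made up for illustration, and reformulate() is used here in place of get(dv) to build each formula from strings.

```r
# Hypothetical names: indvar and coefmat are illustrative only.
depvar <- c("mpg", "disp", "qsec")   # dependent variables, one model each
indvar <- c("cyl", "hp", "wt")       # independent variables shared by all models

coefmat <- sapply(depvar, function(dv) {
    # Build a formula such as mpg ~ cyl + hp + wt from character strings
    f <- reformulate(indvar, response = dv)
    coef(lm(f, data = mtcars))
})

# sapply() puts one model per column; transpose to get one row per model
coefmat <- t(coefmat)
coefmat
```

The rows of coefmat are named after the dependent variables and the columns after the model terms, so a single coefficient can be looked up as, for example, coefmat["mpg", "wt"].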

Edit based on suggestion by @Mike Wise:

If you want a purely numeric dataset but still want to keep the identifier, you can supply it through the row.names argument, like this:

allresults <- data.frame(do.call(rbind, regresults),
                         row.names = depvar)
# tidy up name of intercept variable
names(allresults)[1] <- "intercept"
allresults
##       intercept        cyl          hp        wt
## mpg    38.75179 -0.9416168 -0.01803810 -3.166973
## disp -179.04186 30.3212049  0.21555502 59.222023
## qsec   19.76879 -0.5825700 -0.01881199  1.381334
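As a side note, lm() also accepts a matrix on the left-hand side of the formula, which fits all the models in one call when they share the same independent variables. The sketch below is a different technique from the lapply() loop above, shown on mtcars under that assumption; the names fit and coefmat2 are made up for illustration.

```r
# Hypothetical names: fit and coefmat2 are illustrative only.
# cbind() on the left-hand side fits one regression per response column.
fit <- lm(cbind(mpg, disp, qsec) ~ cyl + hp + wt, data = mtcars)

# coef() returns one column per response; transpose for one row per model
coefmat2 <- t(coef(fit))
coefmat2
```

This only applies when every model uses exactly the same right-hand side and the same rows of data; models that differ in their predictors still need a loop.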

I recently ran into the same issue. A quick and easy way to handle it is to manually collect the results in a data frame using the coefficients() function:

coeffdf <- data.frame(coefficients(lm1),coefficients(lm2))

This works well as long as every regression uses the same independent variables.
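For a self-contained illustration of that one-liner, here is a minimal sketch on mtcars. The answer does not say how lm1 and lm2 were fitted, so the two models below are assumptions chosen to match the earlier example, and the column names mpg and qsec are added only for readability.

```r
# Assumed models: the original answer does not define lm1 and lm2.
lm1 <- lm(mpg  ~ cyl + hp + wt, data = mtcars)
lm2 <- lm(qsec ~ cyl + hp + wt, data = mtcars)

# One column per model; row names come from the coefficient names
coeffdf <- data.frame(mpg  = coefficients(lm1),
                      qsec = coefficients(lm2))
coeffdf
```

The columns line up only because both models share the same terms in the same order; with differing predictors the coefficient vectors would no longer correspond row by row.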

