I am working with R.
I have a matrix called comb, where each row is one combination of regressor indices:
comb <- matrix(c(1, 2, 1, 3, 2, 3), nrow = 3, ncol = 2, byrow = TRUE)
n_comb <- 3
I have a one column dataframe called y with the values of my y variable.
I have a 3 column dataframe called reg with 3 regressors.
I want to write a loop that regresses y on every combination of two variables from reg. Ideally I can store the results somewhere so that I can easily access them afterwards. For instance, I would like to store the R-squared of each regression together with the x variables that produced it.
So far I have tried:
for (i in 1:n_comb) {
  # reg_simple <- select only the variables I need
  all <- cbind(y, reg_simple)
  colnames(all)[1] <- "y"
  regression <- lm(y ~ ., all)
  summary(regression)
  # store the R-squared and the regressors somewhere
}
If we want to use the predictors given by each row of 'comb', loop over the rows of the 'comb' matrix (either with apply and MARGIN = 1, or by splitting into rows with asplit and MARGIN = 1 and looping with sapply), create the formula using reformulate, fit it with lm, and extract the r.squared values:
rsquare_out <- sapply(asplit(comb, 1),
function(i) summary(lm(reformulate(names(reg)[i], response = 'y'),
data = cbind(reg, y)))$r.squared)
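To keep each R-squared next to the regressors that produced it, the same idea can return a small data frame instead of a bare vector. This is only a sketch: the simulated `y` and `reg`, the column names `a`, `b`, `c`, and the `results` object are hypothetical stand-ins for the question's data, and `comb` is assumed to be filled with `byrow = TRUE` so that each row is one pair of column indices.

```r
set.seed(1)

# Hypothetical stand-ins for the question's objects
y    <- data.frame(y = rnorm(50))
reg  <- data.frame(a = rnorm(50), b = rnorm(50), c = rnorm(50))
comb <- matrix(c(1, 2, 1, 3, 2, 3), nrow = 3, ncol = 2, byrow = TRUE)

dat <- cbind(reg, y)

# One row per regression: the regressors used and the resulting R-squared
results <- do.call(rbind, lapply(asplit(comb, 1), function(i) {
  vars <- names(reg)[i]
  fit  <- lm(reformulate(vars, response = "y"), data = dat)
  data.frame(regressors = paste(vars, collapse = " + "),
             r_squared  = summary(fit)$r.squared)
}))

results
```

A data frame like this can then be sorted or filtered in the usual way, for example `results[order(-results$r_squared), ]`.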
Using loops:
Dummy data:
n = 100
y = rnorm(n)
x = data.frame(x1=1*y+rnorm(n),
x2=2*y+rnorm(n),
x3=3*y+rnorm(n))
comb = gtools::combinations(3, 2)
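If the gtools package is not available, base R's combn should give the same index pairs for this case. A minimal sketch (combn returns one combination per column, hence the transpose):

```r
# combn() puts each combination in a column; t() turns them into rows
comb <- t(combn(3, 2))
comb
#      [,1] [,2]
# [1,]    1    2
# [2,]    1    3
# [3,]    2    3
```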
Code:
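regs = list()
for(i in 1:nrow(comb)){
  # y is not a column of x, so lm() looks it up in the calling environment
  mod = summary(lm(y ~ ., x[,comb[i,]]))
  # mod$terms[[3]] is the right-hand side of the fitted formula, e.g. x1 + x2
  regs[[i]] = list(call=mod$terms[[3]],
                   coefs=mod$coefficients,
                   RS=mod$r.squared)
}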
You can include anything else you want in the list().
Output:
> regs
[[1]]
[[1]]$call
x1 + x2
[[1]]$coefs
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.03686327 0.04218032 0.8739449 3.843069e-01
x1 0.13359822 0.04037758 3.3087228 1.316050e-03
x2 0.36019362 0.02384050 15.1084733 3.143002e-27
[[1]]$RS
[1] 0.8384476
[[2]]
[[2]]$call
x1 + x3
[[2]]$coefs
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.03390277 0.02660885 1.274116 2.056664e-01
x1 0.04295226 0.02654442 1.618128 1.088823e-01
x3 0.28556167 0.01064096 26.836090 1.110231e-46
[[2]]$RS
[1] 0.9356962
[[3]]
[[3]]$call
x2 + x3
[[3]]$coefs
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.0291651 0.02391629 1.219466 2.256244e-01
x2 0.1116096 0.02205835 5.059746 1.989407e-06
x3 0.2304448 0.01497633 15.387271 8.944792e-28
[[3]]$RS
[1] 0.9477506
Or you can name each list element after the regressors it uses:
regs = list()
for(i in 1:nrow(comb)){
names = colnames(x)[comb[i,]]
mod = summary(lm(y ~ ., x[,names]))
regs[[paste(names, collapse=" + ")]] = list(coefs=mod$coefficients,
RS=mod$r.squared)}
Output:
> regs
$`x1 + x2`
$`x1 + x2`$coefs
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.03686327 0.04218032 0.8739449 3.843069e-01
x1 0.13359822 0.04037758 3.3087228 1.316050e-03
x2 0.36019362 0.02384050 15.1084733 3.143002e-27
$`x1 + x2`$RS
[1] 0.8384476
$`x1 + x3`
$`x1 + x3`$coefs
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.03390277 0.02660885 1.274116 2.056664e-01
x1 0.04295226 0.02654442 1.618128 1.088823e-01
x3 0.28556167 0.01064096 26.836090 1.110231e-46
$`x1 + x3`$RS
[1] 0.9356962
$`x2 + x3`
$`x2 + x3`$coefs
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.0291651 0.02391629 1.219466 2.256244e-01
x2 0.1116096 0.02205835 5.059746 1.989407e-06
x3 0.2304448 0.01497633 15.387271 8.944792e-28
$`x2 + x3`$RS
[1] 0.9477506
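Once the results are stored in a named list like this, the R-squared values can be pulled out and compared afterwards. The sketch below rebuilds the list on freshly simulated data so that it runs on its own; the seed, the `vars` and `rs` names, and the use of base combn in place of gtools are assumptions made for illustration, so the numbers will differ from the output above.

```r
set.seed(42)
n <- 100
y <- rnorm(n)
x <- data.frame(x1 = 1 * y + rnorm(n),
                x2 = 2 * y + rnorm(n),
                x3 = 3 * y + rnorm(n))
comb <- t(combn(ncol(x), 2))

# Same loop as above, with the list elements named by their regressors
regs <- list()
for (i in seq_len(nrow(comb))) {
  vars <- colnames(x)[comb[i, ]]
  mod  <- summary(lm(y ~ ., data = x[, vars]))
  regs[[paste(vars, collapse = " + ")]] <- list(coefs = mod$coefficients,
                                                RS    = mod$r.squared)
}

# Named numeric vector of R-squared values, one per regression
rs <- sapply(regs, function(r) r$RS)

rs[order(rs, decreasing = TRUE)]  # regressions ranked by R-squared
names(which.max(rs))              # regressors of the best-fitting model
regs[["x1 + x3"]]$coefs           # look up one regression by name
```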