Loop for multiple linear regression
Hi, I'm starting to use R and am stuck on analyzing my data. I have a data frame with 80 columns. Column 1 is the dependent variable and columns 2 to 80 are the independent variables. I want to fit 78 multiple linear regressions, keeping the first independent variable (column 2) fixed in every model, and store all the fits in a list so that I can later compare the models by their AIC scores. How can I do it?
Here is my loop:
data.frame
for(i in 2:80)
{
Regressions <- lm(data.frame$column1 ~ data.frame$column2 + data.frame [,i])
}
Using the iris dataset as an example, you can do:
lapply(seq_along(iris)[-c(1:2)], function(x) lm(data = iris[,c(1:2, x)]))
[[1]]
Call:
lm(data = iris[, c(1:2, x)])
Coefficients:
(Intercept) Sepal.Width Petal.Length
2.2491 0.5955 0.4719
[[2]]
Call:
lm(data = iris[, c(1:2, x)])
Coefficients:
(Intercept) Sepal.Width Petal.Width
3.4573 0.3991 0.9721
[[3]]
Call:
lm(data = iris[, c(1:2, x)])
Coefficients:
(Intercept) Sepal.Width Speciesversicolor Speciesvirginica
2.2514 0.8036 1.4587 1.9468
This works because when you pass a data frame to lm() without a formula, it applies the function DF2formula() under the hood, which treats the first column as the response and all other columns as predictors.
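To see what that implicit formula looks like, here is a small sketch on the same iris subset. It relies on formula() applied to a data frame, which as far as I know builds the same "first column ~ everything else" formula; the names sub, f, fit_implicit and fit_explicit are only for illustration.

```r
# Subset with the response first, then the fixed and the extra predictor
sub <- iris[, c(1:2, 3)]

# formula() on a data frame builds "first column ~ all other columns"
f <- formula(sub)
print(f)

# Fitting with the explicit formula should give the same coefficients
# as passing only the data frame
fit_implicit <- lm(data = sub)
fit_explicit <- lm(f, data = sub)
all.equal(coef(fit_implicit), coef(fit_explicit))
```

This only works as intended when the response really is the first column of the subset, so the column order in c(1:2, x) matters.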
With the for loop, we can initialize a list to store the output:
nm1 <- names(df1)[3:80]  # extra predictors only; column2 stays fixed in every model
Regressions <- vector('list', length(nm1))
for(i in seq_along(Regressions)) {
Regressions[[i]] <- lm(reformulate(c("column2", nm1[i]), "column1"), data = df1)
}
Or use paste instead of reformulate:
for(i in seq_along(Regressions)) {
Regressions[[i]] <- lm(as.formula(paste0("column1 ~ column2 + ",
nm1[i])), data = df1)
}
Using a reproducible example:
nm2 <- names(iris)[3:5]
Regressions2 <- vector('list', length(nm2))
for(i in seq_along(Regressions2)) {
Regressions2[[i]] <- lm(reformulate(c("Sepal.Width", nm2[i]), "Sepal.Length"), data = iris)
}
Regressions2[[1]]
#Call:
#lm(formula = reformulate(c("Sepal.Width", nm2[i]), "Sepal.Length"),
# data = iris)
#Coefficients:
# (Intercept) Sepal.Width Petal.Length
# 2.2491 0.5955 0.4719
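Since the goal in the question is to compare the models by AIC, one possible follow-up is to name the list elements and pull the AIC out of each fit. This is only a sketch on iris; the names extra, models, aic_scores and best are made up for the example, and it uses lapply rather than the for loop purely for brevity.

```r
# Fit Sepal.Length ~ Sepal.Width + <one extra predictor> for each remaining column
extra <- names(iris)[3:5]
models <- lapply(extra, function(v) {
  lm(reformulate(c("Sepal.Width", v), response = "Sepal.Length"), data = iris)
})
names(models) <- extra  # label each model by its extra predictor

# One AIC value per model; lower is better
aic_scores <- sapply(models, AIC)
print(sort(aic_scores))

# Name of the extra predictor in the lowest-AIC model
best <- names(which.min(aic_scores))
print(best)
```

Keep in mind that AIC values are only comparable when every model is fitted to the same rows. If some columns of your data contain NA, lm() drops different rows for different models, so you may want to remove incomplete rows first.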