How to create a loop that adds new variables to a predefined glm model
I would like to create a procedure that, on each iteration of a loop, adds a new variable (from a pool of variables) to a glm model that already contains a few variables that must be part of the final model. I would then like to collect the results of the loop in a list containing each glm formula and its results. I know how to do this manually (code below), but I would like to know how to do it automatically.
Here is a toy dataset and the relevant code to do the task manually:
dat <- read.table(text = "target birds wolfs Country
0 21 7 a
0 8 4 b
1 2 5 c
1 2 4 a
0 8 3 a
1 1 12 a
1 7 10 b
1 1 9 c",header = TRUE)
#birds is a mandatory variable so I'll need to add one of the other variables in addition to birds
glm<-glm(target~birds,data=dat)
dat$glm_predict_response <- ifelse(predict(glm,newdata=dat, type="response")>.5, 1, 0)
xtabs(~target + glm_predict_response, data = dat)
      glm_predict_response
target 0 1
     0 1 2
     1 0 5
prop.table(xtabs(~target + glm_predict_response, data = dat), 2)
      glm_predict_response
target         0         1
     0 1.0000000 0.2857143
     1 0.0000000 0.7142857
#manually I would add the next variable (wolfs) to the model and look at the results:
glm<-glm(target~birds+wolfs,data=dat)
dat$glm_predict_response <- ifelse(predict(glm,newdata=dat, type="response")>.5, 1, 0)
xtabs(~target + glm_predict_response, data = dat)
      glm_predict_response
target 0 1
     0 3 0
     1 0 5
prop.table(xtabs(~target + glm_predict_response, data = dat), 2)
      glm_predict_response
target 0 1
     0 1 0
     1 0 1
In the next iteration I would add the variable "Country" and repeat the same procedure. In real life I have hundreds of variables, so turning this into an automatic process would be great.
I would do it using update to extend the formula on each pass through the loop:
#initiate formula
myform <- target~1
for ( i in c('birds', 'wolfs' , 'Country')) {
#update formula each time in the loop with the above variables
#this line below is practically the only thing I changed
myform <- update(myform, as.formula(paste('~ . +', i)))
glm<-glm(myform,data=dat)
dat$glm_predict_response <- ifelse(predict(glm,newdata=dat, type="response")>.5, 1, 0)
print(myform)
print(xtabs(~ target + glm_predict_response, data = dat))
print(prop.table(xtabs(~target + glm_predict_response, data = dat), 2))
}
Output:
target ~ birds
      glm_predict_response
target 0 1
     0 1 2
     1 0 5
      glm_predict_response
target         0         1
     0 1.0000000 0.2857143
     1 0.0000000 0.7142857
target ~ birds + wolfs
      glm_predict_response
target 0 1
     0 3 0
     1 0 5
      glm_predict_response
target 0 1
     0 1 0
     1 0 1
target ~ birds + wolfs + Country
      glm_predict_response
target 0 1
     0 3 0
     1 0 5
      glm_predict_response
target 0 1
     0 1 0
     1 0 1
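The question also asks for the results to be kept in a list rather than printed. A minimal sketch of one way to do that with the same update approach follows; the names candidates, results, fit, confusion and col_prop are illustrative choices, not part of the original code, and the model is fitted with glm's default gaussian family exactly as in the question.

```r
dat <- read.table(text = "target birds wolfs Country
0 21 7 a
0 8 4 b
1 2 5 c
1 2 4 a
0 8 3 a
1 1 12 a
1 7 10 b
1 1 9 c", header = TRUE)

# variables to add, in the order they should enter the model (assumed order)
candidates <- c("birds", "wolfs", "Country")

myform  <- target ~ 1
results <- list()

for (v in candidates) {
  # grow the formula by one term per iteration
  myform <- update(myform, as.formula(paste("~ . +", v)))
  fit    <- glm(myform, data = dat)  # default gaussian family, as in the question
  pred   <- ifelse(predict(fit, newdata = dat, type = "response") > .5, 1, 0)
  tab    <- table(target = dat$target, predicted = pred)

  # one list entry per step, named after the variable added at that step
  results[[v]] <- list(formula   = myform,
                       fit       = fit,
                       confusion = tab,
                       col_prop  = prop.table(tab, 2))
}
```

Each step can then be inspected afterwards, for example with results$wolfs$formula or results$wolfs$confusion. Naming the entries after the added variable avoids problems with very long formulas, which deparse may split across several strings.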
You can try something like:
# lists to collect the confusion tables and their column proportions
list_1 <- list()
list_2 <- list()
for (i in 2:ncol(dat)) {
  # keep the response (column 1) plus the first i - 1 predictors
  dat1 <- dat[, 1:i]
  fit <- glm(target ~ ., data = dat1)
  dat1$glm_predict_response <- ifelse(predict(fit, newdata = dat1, type = "response") > .5, 1, 0)
  # name each entry after the predictors used in that model
  model_name <- paste(colnames(dat1)[c(-1, -ncol(dat1))], collapse = " ")
  list_1[[i - 1]] <- xtabs(~ target + glm_predict_response, data = dat1)
  names(list_1)[i - 1] <- model_name
  list_2[[i - 1]] <- prop.table(xtabs(~ target + glm_predict_response, data = dat1), 2)
  names(list_2)[i - 1] <- model_name
}
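But the columns need to be in the right order: the response (target) first, followed by the predictors in the order they should enter the model. Because the formula is target ~ ., dat must also not contain extra columns such as a glm_predict_response left over from earlier runs.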
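If relying on column order is inconvenient, the formula can be built from variable names instead. The sketch below assumes a different reading of the question, in which each candidate is tried on its own alongside the mandatory variable rather than cumulatively; mandatory, pool and fits are hypothetical names introduced here, and reformulate (from the stats package) builds the formula from a character vector.

```r
dat <- read.table(text = "target birds wolfs Country
0 21 7 a
0 8 4 b
1 2 5 c
1 2 4 a
0 8 3 a
1 1 12 a
1 7 10 b
1 1 9 c", header = TRUE)

mandatory <- "birds"
# every column other than the response and the mandatory variable
pool <- setdiff(names(dat), c("target", mandatory))

fits <- lapply(pool, function(v) {
  # target ~ birds + <one candidate>
  f    <- reformulate(c(mandatory, v), response = "target")
  fit  <- glm(f, data = dat)  # default gaussian family, as in the question
  pred <- as.integer(predict(fit, newdata = dat, type = "response") > .5)
  list(formula   = f,
       confusion = table(target = dat$target, predicted = pred))
})
names(fits) <- pool
```

Here fits$wolfs holds the model with birds and wolfs, and fits$Country the model with birds and Country. Since the predictions are kept in a local vector rather than written back to dat, the pool is not polluted by a prediction column on later runs.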