
R Loop: Multiple linear regression models (exclude 1 variable at a time)

How do you create a loop that automates running several linear regression models? I have a full model with 12 independent variables. I want to create other models that exclude 1 independent variable at a time.

Please see the example below:

    # round 1 full model
    formula <- Bound_Count~Days_diff_Eff_Subm_2 +
      TR_BS_BROKER_ID_360_2 + TR_BS_BROKER_ID_360_M +
      RURALPOP_P_CWR_2 + RURALPOP_P_CWR_M +
      TR_B_BROKER_ID_360 + TR_SCW_BROKER_ID_360 +
      PIP_Flag +
      TR_BS_BROKER_INDIVIDUAL_720_2 + TR_BS_BROKER_INDIVIDUAL_720_M +
      Resolved_Conflict +
      Priority_2

    # split train and test
    dataL_TT  <- dataL[dataL$DataSplit_Ind=="Modeling",]
    dataL_V <- dataL[dataL$DataSplit_Ind=="Validation",]
    # bind to submit model
    modelTT <- glm(formula
                   ,family=binomial(link = "logit"), data=dataL_TT)
    modelTT$aic

    # round 2 exclude TR_BS_BROKER_ID_360_M
    formula2 <- Bound_Count~Days_diff_Eff_Subm_2 +
      TR_BS_BROKER_ID_360_2 +
      RURALPOP_P_CWR_2 + RURALPOP_P_CWR_M +
      TR_B_BROKER_ID_360 + TR_SCW_BROKER_ID_360 +
      PIP_Flag +
      TR_BS_BROKER_INDIVIDUAL_720_2 + TR_BS_BROKER_INDIVIDUAL_720_M +
      Resolved_Conflict +
      Priority_2
    modelTT2 <- glm(formula2 , family=binomial(link = "logit"), data=dataL_TT)
    modelTT2$aic

    # round 3 exclude Days_diff_Eff_Subm_2
    formula3 <- Bound_Count~TR_BS_BROKER_ID_360_2 + TR_BS_BROKER_ID_360_M +
      RURALPOP_P_CWR_2 + RURALPOP_P_CWR_M +
      TR_B_BROKER_ID_360 + TR_SCW_BROKER_ID_360 +
      PIP_Flag +
      TR_BS_BROKER_INDIVIDUAL_720_2 + TR_BS_BROKER_INDIVIDUAL_720_M +
      Resolved_Conflict +
      Priority_2
    modelTT3 <- glm(formula3 , family=binomial(link = "logit"), data=dataL_TT)
    modelTT3$aic

    # round 4 exclude TR_BS_BROKER_ID_360_2
    formula4 <- Bound_Count~Days_diff_Eff_Subm_2 + TR_BS_BROKER_ID_360_M +
      RURALPOP_P_CWR_2 + RURALPOP_P_CWR_M +
      TR_B_BROKER_ID_360 + TR_SCW_BROKER_ID_360 +
      PIP_Flag +
      TR_BS_BROKER_INDIVIDUAL_720_2 + TR_BS_BROKER_INDIVIDUAL_720_M +
      Resolved_Conflict +
      Priority_2
    modelTT4 <- glm(formula4 , family=binomial(link = "logit"), data=dataL_TT)
    modelTT4$aic

And so on. Basically, I need 12 models, each of which excludes one distinct independent variable.

Here is an idea:

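# toy data; a single-row data frame would leave the slope coefficients NA
set.seed(42)
d <- data.frame(y = rnorm(20), x1 = rnorm(20), x2 = rnorm(20), x3 = rnorm(20))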
allFeatures <- names(d)[-1] # exclude y
# container for models
listOfModels <- vector("list", length(allFeatures))
# loop over features
for (i in seq_along(allFeatures)) {
  # exclude feature i
  currentFeatures <- allFeatures[-i]
  # programmatically assemble regression formula
  regressionFormula <- as.formula(
     paste("y ~ ", paste(currentFeatures, collapse="+")))
  # fit model
  currentModel <- lm(formula = regressionFormula, data = d)
  # store model in container
  listOfModels[[i]] <- currentModel
} 

You can then retrieve the models from listOfModels with the standard list syntax, i.e. listOfModels[[1]] returns the model without x1, listOfModels[[2]] the model without x2, and so on.
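The same loop carries over to the logistic models in the question. The following is a minimal sketch on simulated data: dat, x1 to x3, y, fitWithout and modelsByDropped are hypothetical names standing in for the real data frame and columns, and reformulate() is used in place of paste() plus as.formula().

```r
# Simulated data; every name here is a hypothetical stand-in for the real columns
set.seed(1)
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$y <- rbinom(n, size = 1, prob = plogis(0.5 * dat$x1 - 0.8 * dat$x2))

predictors <- setdiff(names(dat), "y")

# Fit one logistic regression that leaves out a single predictor
fitWithout <- function(dropped) {
  f <- reformulate(setdiff(predictors, dropped), response = "y")
  glm(f, family = binomial(link = "logit"), data = dat)
}

# One model per dropped predictor, named after the variable that was left out
modelsByDropped <- lapply(setNames(predictors, predictors), fitWithout)

# AIC of each reduced model as a named numeric vector
aicByDropped <- vapply(modelsByDropped, AIC, numeric(1))
aicByDropped
```

Naming the list by the dropped variable means modelsByDropped[["x1"]] is the model without x1, which is easier to read back than a positional index.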

EDIT

I am not sure why you would want to sort the data for a histogram, but here:

vectorOfAICs <- vapply(listOfModels, function(x) AIC(x), numeric(1))
sortedAICs <- vectorOfAICs[order(vectorOfAICs)]
hist(sortedAICs)

The answer in the comment is pretty much spot on, with two caveats:

1) To get the AIC from a fitted lm model, call AIC(modelObject).

2) lapply() gives you back a list, which you probably don't want if your goal is to plot the data. It is better to use sapply() or vapply() to get back a numeric vector, which can be sorted and plotted more easily.
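To illustrate the second caveat, here is a small sketch using the built-in mtcars data. The two models and the names fits, aicList and aicVec are made up for the example.

```r
# Two small example models on the built-in mtcars data (illustrative only)
fits <- list(
  full = lm(mpg ~ wt + hp, data = mtcars),
  noHp = lm(mpg ~ wt, data = mtcars)
)

aicList <- lapply(fits, AIC)              # list with one number per model
aicVec  <- vapply(fits, AIC, numeric(1))  # named numeric vector

is.list(aicList)     # TRUE
is.numeric(aicVec)   # TRUE

# A numeric vector can be sorted (and plotted) directly
sort(aicVec)
```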

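# fullmodel: the fitted model with no variables removed
# datahere:  the data frame used to fit fullmodel
# vars:      the variables to remove, one at a time, as a character vector
vars <- c("x1", "x2", "x3")  # placeholder names; substitute your own variables
Map(function(x) update(fullmodel, paste0(". ~ . - ", x), data = datahere), vars)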

Map loops over vars and removes one variable at a time from the full model that was created earlier. For example, update(lm(mtcars), . ~ . - cyl, data = mtcars) removes cyl from the model, i.e. it updates the lm object that had already been fitted. Of course, you can also use add1, drop1 and even drop.terms.
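As a sketch of the drop1 route mentioned above, the example below fits a small model on mtcars and asks for all single-term deletions at once. The choice of predictors is arbitrary and only for illustration. Note that for lm fits the AIC column comes from extractAIC(), which can differ from AIC() by an additive constant, so it is suited to comparing the reduced models with each other.

```r
# Full model on the built-in mtcars data (predictors chosen only for illustration)
fullmodel <- lm(mpg ~ cyl + disp + hp + wt, data = mtcars)

# One row for the full model ("<none>") plus one row per dropped term
dropTable <- drop1(fullmodel)
dropTable

# The same set of reduced models, built explicitly with Map() and update()
vars <- attr(terms(fullmodel), "term.labels")
reduced <- Map(function(x) update(fullmodel, paste0(". ~ . - ", x)), vars)
names(reduced)
```

drop1() only reports summary statistics, so the Map() version is the one to use when the fitted model objects themselves are needed later.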
