
R Loop: Multiple linear regression models (exclude 1 variable at a time)

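How can I create a loop that automatically runs multiple linear regression models? I have a full model with 12 independent variables. I want to create additional models, each of which excludes one independent variable at a time.

See the example below: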

    # round 1 full model
    formula <- Bound_Count~Days_diff_Eff_Subm_2 +
      TR_BS_BROKER_ID_360_2 + TR_BS_BROKER_ID_360_M +
      RURALPOP_P_CWR_2 + RURALPOP_P_CWR_M +
      TR_B_BROKER_ID_360 + TR_SCW_BROKER_ID_360 +
      PIP_Flag + TR_BS_BROKER_INDIVIDUAL_720_2 + TR_BS_BROKER_INDIVIDUAL_720_M +
      Resolved_Conflict + Priority_2

    # split train and test
    dataL_TT  <- dataL[dataL$DataSplit_Ind=="Modeling",]
    dataL_V <- dataL[dataL$DataSplit_Ind=="Validation",]
    # bind to submit model
    modelTT <- glm(formula
                   ,family=binomial(link = "logit"), data=dataL_TT)
    modelTT$aic

    # round 2 exclude TR_BS_BROKER_ID_360_M
    formula2 <- Bound_Count~Days_diff_Eff_Subm_2 +
      TR_BS_BROKER_ID_360_2 +
      RURALPOP_P_CWR_2 + RURALPOP_P_CWR_M +
      TR_B_BROKER_ID_360 + TR_SCW_BROKER_ID_360 +
      PIP_Flag +
      TR_BS_BROKER_INDIVIDUAL_720_2 + TR_BS_BROKER_INDIVIDUAL_720_M +
      Resolved_Conflict +
      Priority_2
    modelTT2 <- glm(formula2 , family=binomial(link = "logit"), data=dataL_TT)
    modelTT2$aic

    # round 3 exclude Days_diff_Eff_Subm_2
    formula3 <- Bound_Count~TR_BS_BROKER_ID_360_2 + TR_BS_BROKER_ID_360_M +
      RURALPOP_P_CWR_2 + RURALPOP_P_CWR_M +
      TR_B_BROKER_ID_360 + TR_SCW_BROKER_ID_360 +
      PIP_Flag +
      TR_BS_BROKER_INDIVIDUAL_720_2 + TR_BS_BROKER_INDIVIDUAL_720_M +
      Resolved_Conflict +
      Priority_2
    modelTT3 <- glm(formula3 , family=binomial(link = "logit"), data=dataL_TT)
    modelTT3$aic

    # round 4 exclude TR_BS_BROKER_ID_360_2
    formula4 <- Bound_Count~Days_diff_Eff_Subm_2 + TR_BS_BROKER_ID_360_M +
      RURALPOP_P_CWR_2 + RURALPOP_P_CWR_M +
      TR_B_BROKER_ID_360 + TR_SCW_BROKER_ID_360 +
      PIP_Flag +
      TR_BS_BROKER_INDIVIDUAL_720_2 + TR_BS_BROKER_INDIVIDUAL_720_M +
      Resolved_Conflict +
      Priority_2
    modelTT4 <- glm(formula4 , family=binomial(link = "logit"), data=dataL_TT)
    modelTT4$aic

And so on. Basically, I need 12 models, each excluding one different independent variable.

Here is an idea:

d <- data.frame(y = 1, x1 = 2, x2 = 3, x3 = 4)
allFeatures <- names(d)[-1] # exclude y
# container for models
listOfModels <- vector("list", length(allFeatures))
# loop over features
for (i in seq_along(allFeatures)) {
  # exclude feature i
  currentFeatures <- allFeatures[-i]
  # programmatically assemble regression formula
  regressionFormula <- as.formula(
     paste("y ~ ", paste(currentFeatures, collapse="+")))
  # fit model
  currentModel <- lm(formula = regressionFormula, data = d)
  # store model in container
  listOfModels[[i]] <- currentModel
} 

You can then retrieve the models from listOfModels using standard list syntax, i.e. listOfModels[[1]] returns the model without x1, and so on.
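The same loop carries over to the logistic regression in the question. The following is a minimal sketch, not the asker's actual setup: the data frame `d`, its columns `x1`, `x2`, `x3`, `y`, and the list name `looModels` are simulated stand-ins invented for illustration. It uses `reformulate()` instead of `paste()` to build each formula and names each model after the variable it leaves out.

```r
# Simulated stand-in data (hypothetical; replace with your own data frame)
set.seed(1)
n <- 200
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
d$y <- rbinom(n, 1, plogis(0.5 * d$x1 - 0.5 * d$x2))

allFeatures <- setdiff(names(d), "y")

# Fit one logistic model per feature, each time leaving that feature out
looModels <- lapply(allFeatures, function(dropped) {
  f <- reformulate(setdiff(allFeatures, dropped), response = "y")
  glm(f, family = binomial(link = "logit"), data = d)
})

# Name each model after the variable it excludes
names(looModels) <- paste0("without_", allFeatures)

# Collect the AICs into a named numeric vector and sort them
aics <- vapply(looModels, AIC, numeric(1))
sort(aics)
```

With names attached, `looModels$without_x2` retrieves the model that excludes `x2`, which can be easier to read than a positional index.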

Edit

I'm not sure why you would want to sort the data for a histogram, but here it is:

vectorOfAICs <- vapply(listOfModels, function(x) AIC(x), numeric(1))
sortedAICs <- vectorOfAICs[order(vectorOfAICs)]
hist(sortedAICs)

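The answer given in the comments is fairly clear, with two caveats:

1) To get the AIC from a fitted lm model, call AIC(modelObject).

2) lapply() returns a list, which you may not want if your goal is to plot the data. It is better to use sapply() or vapply() to return a numeric vector, which is easier to sort and plot.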
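To make the second caveat concrete, here is a small sketch using the built-in mtcars data. The two models and the names `models`, `aicList`, and `aicVec` are arbitrary and chosen only for illustration.

```r
# Two example models on a built-in data set (chosen only for illustration)
models <- list(
  lm(mpg ~ wt, data = mtcars),
  lm(mpg ~ wt + hp, data = mtcars)
)

# lapply() returns a list with one element per model
aicList <- lapply(models, AIC)

# vapply() returns a plain numeric vector and checks that
# each result is a single number
aicVec <- vapply(models, AIC, numeric(1))

is.list(aicList)    # TRUE
is.numeric(aicVec)  # TRUE

# The numeric vector can be sorted and plotted directly
sort(aicVec)
```

If you already have the list, `unlist(aicList)` gives the same numeric vector.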

fullmodel  # the fitted full model, NO VARIABLES REMOVED
vars <- c(variables to be removed one at a time here)  # put all the variables in a character vector
Map(function(x) update(fullmodel, paste0(". ~ . - ", x), data = datahere), vars)

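Map loops over the variables and removes each one in turn from the full model you created. For example, update(lm(mtcars), .~.-cyl, data=mtcars) removes cyl from the lm fit, i.e. it updates the previously created lm object. Of course, you can also use add1, drop1, or even drop.terms.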
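A runnable version of this idea on mtcars might look like the sketch below. The choice of predictors (`cyl`, `wt`, `hp`) and the names `reduced`, `aicsReduced`, and `d1` are assumptions made for illustration. It also shows `drop1()`, which performs the same single-term deletions in one call.

```r
# Full model on a built-in data set (predictors chosen only for illustration)
fullmodel <- lm(mpg ~ cyl + wt + hp, data = mtcars)
vars <- c("cyl", "wt", "hp")

# Refit the model once per variable, dropping that variable each time.
# Map() names the result after the elements of the character vector vars.
reduced <- Map(function(x) update(fullmodel, paste0(". ~ . - ", x)), vars)

# AIC of each reduced model, as a named numeric vector
aicsReduced <- vapply(reduced, AIC, numeric(1))
aicsReduced

# drop1() reports every single-term deletion in one table
d1 <- drop1(fullmodel)
d1
```

Note that the AIC column from `drop1()` differs from `AIC()` by an additive constant for lm fits, so the absolute values do not match, but the differences between models and their ranking do.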
