
R caret: Tuning GLM boost prune parameter

I'm trying to tune the parameters for a GLM boost model. According to the caret package documentation for this model, there are two parameters that can be adjusted: mstop and prune.

    library(caret)
    library(mlbench)

    data(Sonar)

    set.seed(25)
    trainIndex = createDataPartition(Sonar$Class, p = 0.9, list = FALSE)
    training = Sonar[ trainIndex,]
    testing  = Sonar[-trainIndex,]

    ### set training parameters
    fitControl = trainControl(method = "repeatedcv",
                              number = 10,
                              repeats = 10,
                              ## Estimate class probabilities
                              classProbs = TRUE,
                              ## Evaluate two-class performance
                              ## (ROC, sensitivity, specificity) using the following function
                              summaryFunction = twoClassSummary)

    ### train the models
    set.seed(69)
    # Use the expand.grid to specify the search space   
    glmBoostGrid = expand.grid(mstop = c(50, 100, 150, 200, 250, 300),
                               prune = c('yes', 'no'))

    glmBoostFit = train(Class ~ ., 
                        data = training,
                        method = "glmboost",
                        trControl = fitControl,
                        tuneGrid = glmBoostGrid,
                        metric = 'ROC')
    glmBoostFit

The output is the following:

    Boosted Generalized Linear Model

    188 samples
     60 predictors
      2 classes: 'M', 'R'

    No pre-processing
    Resampling: Cross-Validated (10 fold, repeated 10 times)
    Summary of sample sizes: 169, 169, 169, 169, 170, 169, ...
    Resampling results across tuning parameters:

      mstop  ROC        Sens   Spec       ROC SD      Sens SD    Spec SD
       50    0.8261806  0.764  0.7598611  0.10208114  0.1311104  0.1539477
      100    0.8265972  0.729  0.7625000  0.09459835  0.1391250  0.1385465
      150    0.8282083  0.717  0.7726389  0.09570417  0.1418152  0.1382405
      200    0.8307917  0.714  0.7769444  0.09484042  0.1439011  0.1452857
      250    0.8306667  0.719  0.7756944  0.09452604  0.1436740  0.1535578
      300    0.8278403  0.728  0.7722222  0.09794868  0.1425398  0.1576030

    Tuning parameter 'prune' was held constant at a value of yes
    ROC was used to select the optimal model using  the largest value.
    The final values used for the model were mstop = 200 and prune = yes.

The prune parameter is kept constant (Tuning parameter 'prune' was held constant at a value of yes), although glmBoostGrid also contains prune == no. I took a look at the mboost package documentation for the boost_control function, and only the mstop parameter is accessible there. So how can the prune parameter be tuned with the tuneGrid argument of the train function?

The difference is located in this part of the glmboost fitting code in caret:

    if (param$prune == "yes") {
        out <- if (is.factor(y))
            out[mstop(AIC(out, "classical"))]
        else out[mstop(AIC(out))]
    }
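To read this excerpt in context, caret's definition of the method can be printed from the R console. This is a minimal sketch, assuming an installed caret version that ships the "glmboost" model code; the variable name info is only illustrative.

```r
library(caret)

# Look up caret's own definition of the "glmboost" method.
# regex = FALSE requests an exact match on the model code.
info <- getModelInfo("glmboost", regex = FALSE)[["glmboost"]]

# Tuning parameters that train() recognises for this method
print(info$parameters)

# Source of the fitting function, which contains the pruning step
print(info$fit)
```

The parameters table should list both mstop and prune, and the printed fit function shows where the prune value is consulted.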

When prune is "yes", the fitted model is cut back to the number of boosting iterations that minimises the AIC, and the difference lies in how the AIC is calculated: the "classical" AIC for a factor outcome, the default method otherwise. But after running various tests with glmboost in caret, I have my doubts that it is behaving as expected. I have created an issue on GitHub to see if my suspicions are correct. I'll edit my answer if there is more information from the developers.
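As a rough illustration of what the pruning step does, the same calls can be applied to a model fitted directly with mboost, outside of caret. This is a sketch under assumptions: it uses the full Sonar data rather than the training split, a fixed mstop of 200 chosen only for illustration, and the Binomial() family for the two-class outcome. It is not a reproduction of caret's internal fitting code.

```r
library(mboost)
library(mlbench)

data(Sonar)

# Fit a boosted GLM with a fixed number of boosting iterations
# (200 is an arbitrary illustrative value).
fit <- glmboost(Class ~ ., data = Sonar,
                family = Binomial(),
                control = boost_control(mstop = 200))

# "Classical" AIC along the boosting path, as used for factor outcomes
aic <- AIC(fit, method = "classical")

# Iteration at which the AIC is smallest
best_m <- mstop(aic)

# Restrict the model to that iteration.
# Note: per the mboost documentation, this also changes `fit` itself.
pruned <- fit[best_m]
mstop(pruned)
```

Comparing best_m with the mstop values in the tuning grid is one way to check whether pruning actually changes the model that caret ends up evaluating.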
