R caret inconsistent results in model tuning
Using the caret package for model tuning today, I ran into this strange behavior: given a specific combination of tuning parameters T*, the metric associated with T* (Cohen's Kappa) changes depending on whether T* is evaluated alone or as part of a grid of candidate combinations. In the practical example that follows, caret is used to interface with the gbm package.
# Load libraries and data
library(caret)
# Data are available at https://www.dropbox.com/s/1bglmqd14g840j1/mydata.csv?dl=0
data <- read.csv("mydata.csv")
data$target <- as.factor(data$target)
Procedure 1: T* evaluated alone
# Define 5-fold CV as the validation settings
fitControl <- trainControl(method = "cv", number = 5)

# Define the single combination of tuning parameters for this example, T*
gbmGrid <- expand.grid(.interaction.depth = 1,
                       .n.trees = 1000,
                       .shrinkage = 0.1,
                       .n.minobsinnode = 1)

# Fit a gbm with T* as model parameters and Kappa as the scoring metric
set.seed(825)
gbmFit1 <- train(target ~ ., data = data,
                 method = "gbm",
                 distribution = "adaboost",
                 trControl = fitControl,
                 tuneGrid = gbmGrid,
                 verbose = FALSE,
                 metric = "Kappa")
# The results show that T* is associated with Kappa = 0.47. Remember this result and the confusion matrix.
testPred<-predict(gbmFit1, newdata = data)
confusionMatrix(testPred, data$target)
# Output selection
Confusion Matrix and Statistics

          Reference
Prediction   0   1
         0 832  34
         1   0  16

Kappa : 0.4703
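As a sanity check, Cohen's Kappa can be recomputed from the confusion matrix above in base R (a minimal sketch using the counts printed above; `cohen_kappa` is a helper defined here, not a caret function):

```r
# Cohen's Kappa from a confusion matrix (rows = predictions, columns = reference)
cohen_kappa <- function(cm) {
  n  <- sum(cm)
  po <- sum(diag(cm)) / n                      # observed agreement
  pe <- sum(rowSums(cm) * colSums(cm)) / n^2   # agreement expected by chance
  (po - pe) / (1 - pe)
}

cm1 <- matrix(c(832, 0, 34, 16), nrow = 2,
              dimnames = list(Prediction = c("0", "1"), Reference = c("0", "1")))
round(cohen_kappa(cm1), 4)  # 0.4703
```

The same helper applied to the second confusion matrix further down reproduces its Kappa of 0.1036 as well.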
Procedure 2: T* evaluated along with other tuning profiles
Here everything is the same as in Procedure 1, except that several combinations of tuning parameters {T} are considered:
# Notice that the original T* is included in {T}!
gbmGrid2 <- expand.grid(.interaction.depth = 1,
                        .n.trees = seq(100, 1000, by = 100),
                        .shrinkage = 0.1,
                        .n.minobsinnode = 1)
# Fit the gbm
set.seed(825)
gbmFit2 <- train(target ~ ., data = data,
                 method = "gbm",
                 distribution = "adaboost",
                 trControl = fitControl,
                 tuneGrid = gbmGrid2,
                 verbose = FALSE,
                 metric = "Kappa")
# Caret should pick the model with the highest Kappa.
# Since T* is in {T} I would expect the best model to have K >= 0.47
testPred<-predict(gbmFit2, newdata = data)
confusionMatrix(testPred, data$target)
# Output selection
          Reference
Prediction   0   1
         0 831  47
         1   1   3

Kappa : 0.1036
The results are inconsistent with my expectations: the best model in {T} scores Kappa = 0.10. How is that possible, given that T* has Kappa = 0.47 and is included in {T}? Additionally, according to the following plot, the Kappa for T* as evaluated in Procedure 2 is now around 0.01. Any idea what is going on? Am I missing something?
I am getting consistent resampling results from your data and code. The first model has Kappa = 0.00943:
gbmFit1$results
interaction.depth n.trees shrinkage n.minobsinnode Accuracy Kappa AccuracySD
1 1 1000 0.1 1 0.9331022 0.009430576 0.004819004
KappaSD
1 0.0589132
The second model has the same results for n.trees = 1000:
gbmFit2$results
shrinkage interaction.depth n.minobsinnode n.trees Accuracy Kappa AccuracySD
1 0.1 1 1 100 0.9421803 -0.002075765 0.002422952
2 0.1 1 1 200 0.9387776 -0.008326896 0.002468351
3 0.1 1 1 300 0.9365049 -0.012187900 0.002625886
4 0.1 1 1 400 0.9353749 -0.013950906 0.003077431
5 0.1 1 1 500 0.9353685 -0.013961221 0.003244201
6 0.1 1 1 600 0.9342322 -0.015486214 0.005202656
7 0.1 1 1 700 0.9319658 -0.018574633 0.007033402
8 0.1 1 1 800 0.9319658 -0.018574633 0.007033402
9 0.1 1 1 900 0.9342386 0.010955568 0.003144850
10 0.1 1 1 1000 0.9331022 0.009430576 0.004819004
KappaSD
1 0.004641553
2 0.004654972
3 0.003978702
4 0.004837097
5 0.004878259
6 0.007469843
7 0.009470466
8 0.009470466
9 0.057825336
10 0.058913202
Note that the best model in your second run has n.trees = 900:
gbmFit2$bestTune
n.trees interaction.depth shrinkage n.minobsinnode
9 900 1 0.1 1
Since train picks the "best" model based on your metric, your second prediction is using a different model (n.trees of 900 instead of 1000).
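The selection step itself can be sketched in base R from the resampled Kappa values printed above (a minimal illustration of what `train` does internally; this is not caret code, and the vectors below are copied from the output shown earlier):

```r
# Resampled Kappa for each candidate n.trees value, as listed in gbmFit2$results
kappa   <- c(-0.002075765, -0.008326896, -0.012187900, -0.013950906,
             -0.013961221, -0.015486214, -0.018574633, -0.018574633,
              0.010955568,  0.009430576)
n.trees <- seq(100, 1000, by = 100)

# train() keeps the candidate with the highest resampled Kappa ...
best <- n.trees[which.max(kappa)]
best  # 900

# ... so predict(gbmFit2) uses the n.trees = 900 final model,
# not the n.trees = 1000 one that Procedure 1 evaluated in isolation.
```

If you want the Procedure 2 object to predict with T* specifically, refit with a one-row tuneGrid as in Procedure 1.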