I'm trying to apply stacking to my dataset, but I'm stuck on an error.
# Load libraries
library(DJL)
library(caret)
library(caretEnsemble)
# Load data and rename the target levels to syntactically valid R names (needed when classProbs = TRUE)
df <- dataset.engine.2015[, -c(1, 2)]
levels(df$Type) <- list(NA.D = "NA-D", NA.P = "NA-P", SC.P = "SC-P", TC.D = "TC-D", TC.P = "TC-P")
# Run
st.methods <- c("lda", "rpart", "glm", "knn", "svmRadial")
st.control <- trainControl(method = "repeatedcv", number = 5, repeats = 3,
                           savePredictions = TRUE, classProbs = TRUE)
st.models <- caretList(Type ~., data = df, trControl = st.control, methodList = st.methods)
Then I get this:
Something is wrong; all the Accuracy metric values are missing:
Accuracy Kappa
Min. : NA Min. : NA
1st Qu.: NA 1st Qu.: NA
Median : NA Median : NA
Mean :NaN Mean :NaN
3rd Qu.: NA 3rd Qu.: NA
Max. : NA Max. : NA
NA's :1 NA's :1
Error: Stopping
In addition: There were 18 warnings (use warnings() to see them)
Can anyone help me to fix this error?
The glm model cannot be used to predict a categorical dependent variable with more than two categories. Either remove glm from st.methods or replace it with, for example, multinom, gbm, or rf (random forest).
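As a minimal sketch of the first option, the method list can be filtered before it is passed to caretList. The helper drop_binary_only below is hypothetical (it is not part of caret or caretEnsemble), it assumes the outcome is a factor, and the toy factor only stands in for df$Type:

```r
# Hypothetical helper (not a caret function): remove methods that only
# handle two-class outcomes when the outcome has more than two classes.
# Assumes `outcome` is a factor.
drop_binary_only <- function(methods, outcome, binary_only = c("glm")) {
  if (nlevels(outcome) > 2) setdiff(methods, binary_only) else methods
}

# Toy outcome with the same five class labels as in the question.
type <- factor(c("NA.D", "NA.P", "SC.P", "TC.D", "TC.P"))

st.methods <- c("lda", "rpart", "glm", "knn", "svmRadial")
st.methods <- drop_binary_only(st.methods, type)
print(st.methods)
```

With five classes, glm is dropped and the remaining methods keep their original order. With a two-class outcome the list would be returned unchanged.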
Here are two useful experiments. In the first, we consider only glm:
rm(list=ls())
library(DJL)
library(caret)
library(caretEnsemble)
df <- dataset.engine.2015[, -c(1, 2)]
levels(df$Type) <- list(NA.D = "NA-D", NA.P = "NA-P", SC.P = "SC-P", TC.D = "TC-D", TC.P = "TC-P")
st.control <- trainControl(method = "repeatedcv", number = 5, repeats = 3,
                           savePredictions = TRUE, classProbs = TRUE)
st.methods <- c("glm")
st.models <- caretList(Type ~., data = df, trControl = st.control, methodList = st.methods)
Here is the error message:
Something is wrong; all the Accuracy metric values are missing:
Accuracy Kappa
Min. : NA Min. : NA
1st Qu.: NA 1st Qu.: NA
Median : NA Median : NA
Mean :NaN Mean :NaN
3rd Qu.: NA 3rd Qu.: NA
Max. : NA Max. : NA
NA's :1 NA's :1
Error in train.default(x, y, weights = w, ...) : Stopping
In addition: There were 18 warnings (use warnings() to see them)
Now we replace glm with multinom:
st.methods <- c("multinom")
st.models <- caretList(Type ~., data = df, trControl = st.control, methodList = st.methods)
print(st.models)
The output is:
$multinom
Penalized Multinomial Regression
1206 samples
5 predictor
5 classes: 'NA.D', 'NA.P', 'SC.P', 'TC.D', 'TC.P'
No pre-processing
Resampling: Cross-Validated (5 fold, repeated 3 times)
Summary of sample sizes: 964, 965, 965, 965, 965, 964, ...
Resampling results across tuning parameters:
decay Accuracy Kappa
0e+00 0.9306411 0.8518294
1e-04 0.9300901 0.8506964
1e-01 0.9328507 0.8564466
Accuracy was used to select the optimal model using the largest value.
The final value used for the model was decay = 0.1.