In R caret, obtain in-sample and out-of-sample probability estimates
I have some data similar to:
data(Titanic) # need one row per passenger
df <- data.frame(Titanic, stringsAsFactors=TRUE)
df <- df[rep(seq_len(nrow(df)), df[,"Freq"]), which(names(df)!="Freq")]
I trained a model in caret using repeated cross-validated logistic regression, like:
library(caret)
tc <- trainControl(method="repeatedcv", number=10, repeats=3,
                   returnData=TRUE, savePredictions=TRUE, classProbs=TRUE)
# No weights argument: df already has one row per passenger and no Freq column
glmFit <- train(Survived ~ Class + Sex + Age, data = df,
                method="glm", family="binomial",
                trControl = tc)
summary(glmFit)
I would like to obtain the average in-sample fitted probability and the average out-of-sample predicted probability for each row in the data frame. With 10-fold CV x 3 repeats, each row falls in the training set 27 times and in the held-out fold 3 times, so these are averages of 27 and of 3 values, respectively.
I would like to append each row's average in-sample and out-of-sample probability estimates onto the data frame, so that it looks like the last two columns of:
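As a quick sanity check on those counts, here is a minimal sketch that tabulates how often each row lands in a training set. It assumes caret's `createMultiFolds()` produces the same kind of split that `trainControl(method="repeatedcv")` uses; the seed and the variable names (`train_idx`, `n_train`, `n_heldout`) are arbitrary choices for illustration.

```r
library(caret)

# Rebuild the one-row-per-passenger data frame
data(Titanic)
df <- data.frame(Titanic, stringsAsFactors = TRUE)
df <- df[rep(seq_len(nrow(df)), df[, "Freq"]), names(df) != "Freq"]

set.seed(1)  # arbitrary seed, only for reproducibility
# One vector of training-row indices per resample (10 folds x 3 repeats)
train_idx <- createMultiFolds(df$Survived, k = 10, times = 3)

n_resamples <- length(train_idx)
# How many resamples use each row for training
n_train <- tabulate(unlist(train_idx), nbins = nrow(df))
# The remaining resamples hold the row out
n_heldout <- n_resamples - n_train

table(n_train)
table(n_heldout)
```

Each row is held out exactly once per repeat, which is where the 3 and the 27 (= 30 - 3) come from.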
> df_appended
Class  Sex  Age    Survived  training_p_surv_est  testing_p_surv_est
3rd    M    Child  0         .251                 .259
3rd    M    Child  1         .251                 .259
2nd    M    Child  1         .324                 .319
2nd    M    Child  0         .324                 .319
According to ?trainControl, I have saved the holdout predictions for each resample with savePredictions=TRUE. (And classProbs=TRUE, since I want raw probabilities, not classes.)
How do I access the in-sample and out-of-sample predictions? Looking at ?predict.train, I have tried using
extractProb(list(glmFit))
#Error in eval(expr, envir, enclos) : object 'Class2nd' not found
Many thanks.
If you take a look at your glmFit object, it contains an element named 'pred', a data frame holding the held-out predictions.
head(glmFit$pred)
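You will get the predicted probability as well as the predicted class for each row in each CV repeat and fold. The rowIndex column identifies the original row, and the Resample column identifies the fold and repeat.
Cheers.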
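To get from `glmFit$pred` to the per-row averages asked for, one possible approach is sketched below. The out-of-sample part averages the held-out probabilities by `rowIndex`. As far as I know caret does not store in-sample predictions for each resample, so the in-sample part is an assumption on my side: it refits a plain `glm()` on the training rows listed in `glmFit$control$index` and averages the fitted values. The seed and the helper names (`p_sum`, `n_fit`, `oos`) are arbitrary.

```r
library(caret)

# Rebuild the one-row-per-passenger data frame from the question
data(Titanic)
df <- data.frame(Titanic, stringsAsFactors = TRUE)
df <- df[rep(seq_len(nrow(df)), df[, "Freq"]), names(df) != "Freq"]
rownames(df) <- NULL

set.seed(1)  # arbitrary seed so the folds are reproducible
tc <- trainControl(method = "repeatedcv", number = 10, repeats = 3,
                   savePredictions = TRUE, classProbs = TRUE)
glmFit <- train(Survived ~ Class + Sex + Age, data = df,
                method = "glm", family = "binomial", trControl = tc)

# In-sample (assumed approach): refit on each resample's training rows
# and accumulate the fitted P(Survived = "Yes") for those rows
p_sum <- numeric(nrow(df))
n_fit <- integer(nrow(df))
for (idx in glmFit$control$index) {
  fit <- glm(Survived ~ Class + Sex + Age, data = df[idx, ], family = binomial)
  p_sum[idx] <- p_sum[idx] + unname(fitted(fit))
  n_fit[idx] <- n_fit[idx] + 1L
}
df$training_p_surv_est <- p_sum / n_fit

# Out-of-sample: average the held-out P(Survived = "Yes") per original row
oos <- aggregate(Yes ~ rowIndex, data = glmFit$pred, FUN = mean)
df$testing_p_surv_est <- NA_real_
df$testing_p_surv_est[oos$rowIndex] <- oos$Yes

head(df)
```

Because `glm` has no tuning parameters, `glmFit$pred` holds exactly one prediction per row per repeat. For a tuned model you would probably want to filter `pred` to the rows matching `glmFit$bestTune` before averaging.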