How to send a confusion matrix to caret's confusionMatrix?
I'm looking at this data set: https://archive.ics.uci.edu/ml/datasets/Credit+Approval. I built a ctree:
myFormula<-class~. # class is a factor of "+" or "-"
ct <- ctree(myFormula, data = train)
And now I'd like to pass the predictions to caret's confusionMatrix function to get all the stats associated with the confusion matrix:
testPred <- predict(ct, newdata = test)
#### This is where I'm doing something wrong ####
confusionMatrix(table(testPred, test$class),positive="+")
#### ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ####
$positive
[1] "+"
$table
td
testPred - +
- 99 6
+ 20 88
$overall
Accuracy Kappa AccuracyLower AccuracyUpper AccuracyNull AccuracyPValue McnemarPValue
8.779343e-01 7.562715e-01 8.262795e-01 9.186911e-01 5.586854e-01 6.426168e-24 1.078745e-02
$byClass
Sensitivity Specificity Pos Pred Value Neg Pred Value Precision Recall F1
0.9361702 0.8319328 0.8148148 0.9428571 0.8148148 0.9361702 0.8712871
Prevalence Detection Rate Detection Prevalence Balanced Accuracy
0.4413146 0.4131455 0.5070423 0.8840515
$mode
[1] "sens_spec"
$dots
list()
attr(,"class")
[1] "confusionMatrix"
So Sensitivity is:
Sensitivity = A/(A+C)
(from caret's confusionMatrix doc)
If you take my confusion matrix:
$table
td
testPred - +
- 99 6
+ 20 88
You can see this doesn't add up:
Sensitivity = 99/(99+20) = 99/119 = 0.8319328
In my confusionMatrix results, that value is reported for Specificity. However, Specificity is
Specificity = D/(B+D) = 88/(88+6) = 88/94 = 0.9361702
which is the value reported for Sensitivity.
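To check the arithmetic, here is a minimal base-R sketch that computes both rates from the table above, once treating "+" as the positive class and once treating "-" as positive. The helper `rates()` is a hypothetical name, not part of caret, and the sketch assumes rows are predictions and columns are true classes. In the docs' formula, A is the cell where prediction and truth are both the positive class, so with positive="+" A is 88, not 99.

```r
# Rows are predictions, columns are the true classes (same counts as above).
cm <- matrix(c(99, 20, 6, 88), nrow = 2,
             dimnames = list(pred = c("-", "+"), truth = c("-", "+")))

# Hypothetical helper: sensitivity and specificity for a chosen positive class.
rates <- function(cm, positive) {
  negative <- setdiff(colnames(cm), positive)
  tp <- cm[positive, positive]  # predicted positive, truly positive
  fn <- cm[negative, positive]  # predicted negative, truly positive
  tn <- cm[negative, negative]  # predicted negative, truly negative
  fp <- cm[positive, negative]  # predicted positive, truly negative
  c(sensitivity = tp / (tp + fn), specificity = tn / (tn + fp))
}

rates(cm, "+")  # sensitivity = 88/94,  specificity = 99/119
rates(cm, "-")  # sensitivity = 99/119, specificity = 88/94
```

With "+" as the positive class the numbers match what confusionMatrix printed, so the hand calculation above appears to have used "-" as the positive class.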
I've tried this:
confusionMatrix(td, testPred, positive="+")
but got even weirder results. What am I doing wrong?
UPDATE: I also realized that my confusion matrix is different from what caret thought it was:
Mine:                  Caret:
          td                    testPred
testPred   -   +       td        -   +
       -  99   6       -        99  20
       +  20  88       +         6  88
As you can see, it thinks my False Positive and False Negative are backwards.
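A small base-R sketch of why the two tables differ: swapping the argument order of table() transposes the result, which exchanges the false-positive and false-negative cells. The vectors below are made-up toy data, not the credit data.

```r
# Toy data (an assumption for illustration only).
truth <- factor(c("-", "-", "-", "+", "+", "+"), levels = c("-", "+"))
pred  <- factor(c("-", "+", "+", "+", "+", "-"), levels = c("-", "+"))

a <- table(pred, truth)  # rows = predictions, columns = truth
b <- table(truth, pred)  # rows = truth, columns = predictions

a["+", "-"]      # predicted "+", truly "-": the false positives
b["+", "-"]      # truly "+", predicted "-": the false negatives
all(t(a) == b)   # the two tables are transposes of each other
```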
UPDATE: I found it's a lot better to send the data, rather than a table, as a parameter. From the confusionMatrix docs:
reference
a factor of classes to be used as the true results
I took this to mean the symbol that constitutes a positive outcome. In my case, this would have been a +. However, 'reference' refers to the actual outcomes from the data set, aka the dependent variable.
So I should have used confusionMatrix(testPred, test$class). If your data is out of order for some reason, it will shift it into the correct order (so the positive and negative outcomes/predictions align correctly in the confusion matrix).
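Here is a hedged sketch of the factor-based call, using vectors rebuilt from the counts in the table above rather than the real test set. The names `truth`, `pred` and `res` are mine. The caret call is guarded so the snippet still runs where caret is not installed, and some caret versions may also ask for the e1071 package.

```r
# Rebuild vectors with the same counts as the table above (213 test rows).
truth <- factor(rep(c("-", "+"), times = c(119, 94)), levels = c("-", "+"))
pred  <- factor(c(rep(c("-", "+"), times = c(99, 20)),  # rows whose true class is "-"
                  rep(c("-", "+"), times = c(6, 88))),  # rows whose true class is "+"
                levels = c("-", "+"))

# Only call caret if it is installed, so the sketch still runs without it.
if (requireNamespace("caret", quietly = TRUE)) {
  # data = predictions, reference = true classes, positive = label of interest
  res <- caret::confusionMatrix(data = pred, reference = truth, positive = "+")
  print(res$table)
  print(res$byClass[c("Sensitivity", "Specificity")])
}
```

Passing the two factors by name makes it harder to mix up which one is the prediction and which one is the truth.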
However, if you are worried about the outcome being the correct factor, install the plyr library, and use revalue to change the factor:
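install.packages("plyr")  # only needed once
library(plyr)
newDF <- df  # df is the original data frame
# revalue() renames factor levels; the new level names are given as strings
newDF$class <- revalue(newDF$class, c("+" = "1", "-" = "0"))
# You'd have to rerun your model using newDF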
I'm not sure why this worked, but I just removed the positive parameter:
confusionMatrix(table(testPred, test$class))
My Confusion Matrix:
td
testPred - +
- 99 6
+ 20 88
Caret's Confusion Matrix:
td
testPred - +
- 99 6
+ 20 88
Although now it says $positive: "-", so I'm not sure if that's good or bad.
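As far as I can tell from the confusionMatrix docs, when positive is not supplied and there are only two classes, the first factor level is used as the positive class, which would explain $positive: "-". It is neither good nor bad in itself, but Sensitivity and Specificity are then reported for "-" rather than "+". A minimal base-R sketch of inspecting and changing the first level (the factor `cls` is toy data, not the credit data):

```r
# Toy factor with the levels ordered as in the tables above.
cls <- factor(c("-", "+", "+", "-"), levels = c("-", "+"))
levels(cls)[1]            # "-" is the first level here

# relevel() moves the chosen level to the front without changing the data.
cls_pos_first <- relevel(cls, ref = "+")
levels(cls_pos_first)[1]  # now "+" comes first
```

Passing positive="+" explicitly, as in the original call, avoids depending on the level order at all.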