RStudio: I need the confidence intervals of sensitivity, specificity, and positive and negative predictive values from a confusion matrix
I am writing a paper about the validity of a billing code in hospitalized children. I am a very novice RStudio user. I need the confidence intervals for the sensitivity, specificity, and positive and negative predictive values, but I can't figure out how to get them.
My data has 3 columns: ID, true value, billing value
Here is my code:
confusionMatrix(table(finalcodedataset$billing_value, finalcodedataset$true_value),
positive="1", boot=TRUE, boot_samples=4669, alpha=0.05)
Here is the output:
Confusion Matrix and Statistics
       0    1
  0 4477  162
  1   10   20
Accuracy : 0.9632
95% CI : (0.9574, 0.9684)
No Information Rate : 0.961
P-Value [Acc > NIR] : 0.238
Kappa : 0.1796
Mcnemar's Test P-Value : <2e-16
Sensitivity : 0.109890
Specificity : 0.997771
Pos Pred Value : 0.666667
Neg Pred Value : 0.965079
Prevalence : 0.038981
Detection Rate : 0.004284
Detection Prevalence : 0.006425
Balanced Accuracy : 0.553831
'Positive' Class : 1
caret and other packages use the Clopper-Pearson (exact binomial) method to calculate the confidence interval. In the output above, that interval is reported only for accuracy.
I consider your 2x2 table reversed, since the TP (true positive) cell is at the bottom right. If the TP were at the top left, the variables (A, B, C, D) would be switched.
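# Cell counts read from the 2x2 table (positive class = "1")
D = 4477  # true negatives
C = 162   # false negatives
B = 10    # false positives
A = 20    # true positives
Acc = (A+D)/(A+B+C+D)
Sensitivity = A / (A + C)
Specificity = D / (D + B)
P = (A+C)/(A+B+C+D)  # prevalence
PPV = (Sensitivity*P)/((Sensitivity*P)+((1-Specificity)*(1-P)))
NPV = (Specificity*(1-P))/(((1 - Sensitivity)*P)+((Specificity)*(1-P)))
n = A+B+C+D
x = n - (A+D)  # number of misclassified cases
alpha = 0.05
# Clopper-Pearson limits for the error rate x/n, flipped to give limits for accuracy
ub = 1 - ((1 + (n - x + 1)/ (x * qf(alpha *.5, 2*x, 2*(n - x + 1))))^-1)
lb = 1 - ((1 + (n - x) / ((x + 1)* qf(1-(alpha*.5), 2*(x+1), 2*(n-x))))^-1)
CI = c(lb,ub)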
> Acc
[1] 0.9631613
> CI
[1] 0.9573536 0.9683800
> Sensitivity
[1] 0.1098901
> Specificity
[1] 0.9977713
> PPV
[1] 0.6666667
> NPV
[1] 0.9650787
Here is also a good resource for where these formulas come from.
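The hand-written F-quantile formula above covers accuracy only. As a minimal sketch, the same Clopper-Pearson interval can be obtained from base R's binom.test() and applied to each of the other measures, since sensitivity, specificity, PPV and NPV are all simple proportions of cell counts. The helper name exact_ci is hypothetical (not part of caret or the original answer), and the counts are copied from the table in the question.

```r
# Cell counts from the question (positive class = "1").
TP <- 20; FP <- 10; FN <- 162; TN <- 4477

# Hypothetical helper: exact (Clopper-Pearson) interval for x successes in n trials.
exact_ci <- function(x, n, conf.level = 0.95) {
  bt <- binom.test(x, n, conf.level = conf.level)
  c(estimate = x / n, lower = bt$conf.int[1], upper = bt$conf.int[2])
}

# One row per measure; each is "successes / trials" on the relevant margin.
ci <- rbind(
  accuracy    = exact_ci(TP + TN, TP + FP + FN + TN),
  sensitivity = exact_ci(TP, TP + FN),
  specificity = exact_ci(TN, TN + FP),
  ppv         = exact_ci(TP, TP + FP),
  npv         = exact_ci(TN, TN + FN)
)
print(round(ci, 4))
```

The accuracy row should agree with the CI printed above, which is a quick check that the two approaches compute the same interval.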
You can use the epiR package for this purpose.
Example:
library(epiR)
data <- as.table(matrix(c(670,202,74,640), nrow = 2, byrow = TRUE))
rval <- epi.tests(data, conf.level = 0.95)
print(rval)
          Outcome +    Outcome -      Total
Test +          670          202        872
Test -           74          640        714
Total           744          842       1586

Point estimates and 95 % CIs:
---------------------------------------------------------
Apparent prevalence                    0.55 (0.52, 0.57)
True prevalence                        0.47 (0.44, 0.49)
Sensitivity                            0.90 (0.88, 0.92)
Specificity                            0.76 (0.73, 0.79)
Positive predictive value              0.77 (0.74, 0.80)
Negative predictive value              0.90 (0.87, 0.92)
Positive likelihood ratio              3.75 (3.32, 4.24)
Negative likelihood ratio              0.13 (0.11, 0.16)
---------------------------------------------------------
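To apply this to the data in the question, the counts have to be arranged with the test (billing code) in rows and the true outcome in columns, positives first, as in the example table above. The following is a sketch under that assumption. The dimnames are illustrative labels only, and the call is guarded so the snippet still runs when epiR is not installed.

```r
# Counts from the question, arranged as in the epiR example above:
# rows = billing code positive/negative, columns = true value positive/negative.
dat <- as.table(matrix(c(20, 10, 162, 4477), nrow = 2, byrow = TRUE,
                       dimnames = list(billing = c("Test +", "Test -"),
                                       truth = c("Outcome +", "Outcome -"))))

# Only call epi.tests() if the package is available.
if (requireNamespace("epiR", quietly = TRUE)) {
  rval <- epiR::epi.tests(dat, conf.level = 0.95)
  print(rval)
}
```

Note that this layout is the transpose-and-flip of the table printed by confusionMatrix() in the question, where the positive class sits in the bottom-right cell.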
The following reproducible example is partially inspired by "ROC curve from training data in caret".
library(caret)    # train(), confusionMatrix(), sensitivity(), posPredValue(), ...
library(mlbench)  # provides the Sonar data set
data(Sonar)
ctrl <- trainControl(method = "cv", summaryFunction = twoClassSummary, classProbs = TRUE, savePredictions = TRUE)
set.seed(42)
fit1 <- train(Class ~ ., data = Sonar, method = "rf", trControl = ctrl)
bestmodel <- merge(fit1$bestTune, fit1$pred)
mtx <- confusionMatrix(table(bestmodel$pred, bestmodel$obs))$table
# M R
# M 104 23
# R 7 74
# 95% confidence intervals (normal approximation: estimate +/- 1.96 * standard error)
## Sensitivity
sens_errors <- sqrt(sensitivity(mtx) * (1 - sensitivity(mtx)) / sum(mtx[,1]))
sensLower <- sensitivity(mtx) - 1.96 * sens_errors
sensUpper <- sensitivity(mtx) + 1.96 * sens_errors
## Specificity
spec_errors <- sqrt(specificity(mtx) * (1 - specificity(mtx)) / sum(mtx[,2]))
specLower <- specificity(mtx) - 1.96 * spec_errors
specUpper <- specificity(mtx) + 1.96 * spec_errors
## Positive Predictive Values
ppv_errors <- sqrt(posPredValue(mtx) * (1 - posPredValue(mtx)) / sum(mtx[1,]))
ppvLower <- posPredValue(mtx) - 1.96 * ppv_errors
ppvUpper <- posPredValue(mtx) + 1.96 * ppv_errors
## Negative Predictive Values
npv_errors <- sqrt(negPredValue(mtx) * (1 - negPredValue(mtx)) / sum(mtx[2,]))
npvLower <- negPredValue(mtx) - 1.96 * npv_errors
npvUpper <- negPredValue(mtx) + 1.96 * npv_errors
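The four blocks above repeat the same normal-approximation (Wald) calculation, so they can be folded into one function. The sketch below uses a hypothetical helper named wald_ci and rebuilds the matrix from the counts shown in the comments above rather than refitting the model. It also clamps the limits to [0, 1], which the original code does not do, because a Wald interval can spill outside that range when a proportion is near 0 or 1 or the denominator is small.

```r
# Confusion matrix from the comments above (rows = predicted, columns = observed).
mtx <- as.table(matrix(c(104, 23, 7, 74), nrow = 2, byrow = TRUE,
                       dimnames = list(pred = c("M", "R"), obs = c("M", "R"))))

# Hypothetical helper: Wald interval for the proportion x / n, clamped to [0, 1].
wald_ci <- function(x, n, conf = 0.95) {
  p  <- x / n
  z  <- qnorm(1 - (1 - conf) / 2)   # about 1.96 for conf = 0.95
  se <- sqrt(p * (1 - p) / n)
  c(estimate = p, lower = max(0, p - z * se), upper = min(1, p + z * se))
}

# First level ("M") is treated as the positive class, matching caret's default.
wald <- rbind(
  sensitivity = wald_ci(mtx[1, 1], sum(mtx[, 1])),
  specificity = wald_ci(mtx[2, 2], sum(mtx[, 2])),
  ppv         = wald_ci(mtx[1, 1], sum(mtx[1, ])),
  npv         = wald_ci(mtx[2, 2], sum(mtx[2, ]))
)
print(round(wald, 4))
```

For a table like the one in the question, where only 30 cases are test-positive, the exact (Clopper-Pearson) interval is likely the safer choice than this approximation.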