RStudio - I need confidence intervals for sensitivity, specificity, and positive and negative predictive values from a confusion matrix

Question

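I am writing a paper on the validity of billing codes for hospitalized children. I am a very new RStudio user. I need confidence intervals for sensitivity and specificity and for the positive and negative predictive values, but I can't figure out how to get them.

My data has 3 columns: ID, true value, billing value

Here is my code: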

confusionMatrix(table(finalcodedataset$billing_value, finalcodedataset$true_value), 
                positive="1", boot=TRUE, boot_samples=4669, alpha=0.05)

Here is the output:

Confusion Matrix and Statistics

       0    1
  0 4477  162
  1   10   20

               Accuracy : 0.9632          
                 95% CI : (0.9574, 0.9684)
    No Information Rate : 0.961           
    P-Value [Acc > NIR] : 0.238           

                  Kappa : 0.1796          
 Mcnemar's Test P-Value : <2e-16          

            Sensitivity : 0.109890        
            Specificity : 0.997771        
         Pos Pred Value : 0.666667        
         Neg Pred Value : 0.965079        
             Prevalence : 0.038981        
         Detection Rate : 0.004284        
   Detection Prevalence : 0.006425        
      Balanced Accuracy : 0.553831        

       'Positive' Class : 1

Answer 1

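caret and other packages use the Clopper-Pearson interval method to compute confidence intervals.

I think your 2x2 table is reversed, because TP (true positives) is in the bottom-right cell. If TP were in the top-left cell, the variables (A, B, C, D) would be switched.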

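# Cell counts from the question's table (rows = billing value, columns = true value)
D = 4477  # true negatives
C = 162   # false negatives
B = 10    # false positives
A = 20    # true positives

Acc = (A+D)/(A+B+C+D)
Sensitivity = A / (A + C)
Specificity = D / (D + B)
P = (A+C)/(A+B+C+D)  # prevalence
PPV = (Sensitivity*P)/((Sensitivity*P)+((1-Specificity)*(1-P)))
NPV = (Specificity*(1-P))/(((1 - Sensitivity)*P)+((Specificity)*(1-P)))

# Clopper-Pearson interval for accuracy, computed from the error count x
n = A+B+C+D
x = n - (A+D)
alpha = 0.05

ub = 1 - ((1 + (n - x + 1)/ (x * qf(alpha *.5, 2*x, 2*(n - x + 1))))^-1)
lb = 1 - ((1 + (n - x) / ((x + 1)* qf(1-(alpha*.5), 2*(x+1), 2*(n-x))))^-1)
CI = c(lb,ub)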

> Acc
[1] 0.9631613
> CI
[1] 0.9573536 0.9683800
> Sensitivity
[1] 0.1098901
> Specificity
[1] 0.9977713
> PPV
[1] 0.6666667
> NPV
[1] 0.9650787

Here is also a good resource on where these formulas come from.
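The calculation above gives an interval only for accuracy. As a minimal sketch (not part of the original answer), the same Clopper-Pearson interval can be obtained for the other four measures with base R's binom.test(), treating each measure as a simple binomial proportion with its own numerator and denominator. The helper name exact_ci is made up for illustration, and the counts are the ones from the question's table, assuming rows are billing values and columns are true values.

```r
# Counts from the question's table (assumed layout: rows = billing, columns = truth)
TP <- 20; FP <- 10; FN <- 162; TN <- 4477

# Hypothetical helper: point estimate plus exact (Clopper-Pearson) interval
exact_ci <- function(x, n, conf.level = 0.95) {
  ci <- binom.test(x, n, conf.level = conf.level)$conf.int
  c(estimate = x / n, lower = ci[1], upper = ci[2])
}

results <- rbind(
  Accuracy    = exact_ci(TP + TN, TP + FP + FN + TN),
  Sensitivity = exact_ci(TP, TP + FN),
  Specificity = exact_ci(TN, TN + FP),
  PPV         = exact_ci(TP, TP + FP),
  NPV         = exact_ci(TN, TN + FN)
)
print(round(results, 4))
```

Treating PPV and NPV as plain proportions is reasonable only when the prevalence in the sample reflects the population of interest; if the sample was enriched for positives or negatives, a prevalence-adjusted approach would be needed instead.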

Answer 2

You can use the epiR package for this purpose.

Example:

library(epiR)
data <- as.table(matrix(c(670,202,74,640), nrow = 2, byrow = TRUE))
rval <- epi.tests(data, conf.level = 0.95)
print(rval)

          Outcome +    Outcome -      Total
Test +          670          202        872
Test -           74          640        714
Total           744          842       1586

Point estimates and 95 % CIs:
---------------------------------------------------------
Apparent prevalence                    0.55 (0.52, 0.57)
True prevalence                        0.47 (0.44, 0.49)
Sensitivity                            0.90 (0.88, 0.92)
Specificity                            0.76 (0.73, 0.79)
Positive predictive value              0.77 (0.74, 0.80)
Negative predictive value              0.90 (0.87, 0.92)
Positive likelihood ratio              3.75 (3.32, 4.24)
Negative likelihood ratio              0.13 (0.11, 0.16)
---------------------------------------------------------
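Applied to the counts from the question, a sketch might look like the following. As the example output shows, epi.tests() expects test-positive in the first row and outcome-positive in the first column, so the cells of the question's table have to be reordered. Treating the billing value as the "test" and the true value as the "outcome" is an assumption about the data. The call is guarded so the snippet still runs when epiR is not installed.

```r
# Reordered counts from the question:
# row 1 = billing positive, row 2 = billing negative
# col 1 = truly positive,   col 2 = truly negative
dat <- as.table(matrix(c(20,   10,
                         162, 4477),
                       nrow = 2, byrow = TRUE))
print(dat)

# Run epi.tests() only if the epiR package is available
if (requireNamespace("epiR", quietly = TRUE)) {
  rval <- epiR::epi.tests(dat, conf.level = 0.95)
  print(rval)
}
```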

Answer 3

The following reproducible example is partly inspired by ROC curve from training data in caret.

library(MLeval)
library(caret)
library(pROC)

data(Sonar)
ctrl <- trainControl(method = "cv", summaryFunction = twoClassSummary,
                     classProbs = TRUE, savePredictions = TRUE)
set.seed(42)
fit1 <- train(Class ~ ., data = Sonar, method = "rf", trControl = ctrl)


bestmodel <- merge(fit1$bestTune, fit1$pred)
mtx <- confusionMatrix(table(bestmodel$pred, bestmodel$obs))$table

 #     M   R
 # M 104  23
 # R   7  74

# 95% Confidence Interval

## Sensitivity
sens_errors <- sqrt(sensitivity(mtx) * (1 - sensitivity(mtx)) / sum(mtx[,1]))
sensLower <- sensitivity(mtx) - 1.96 * sens_errors
sensUpper <- sensitivity(mtx) + 1.96 * sens_errors


## Specificity
spec_errors <- sqrt(specificity(mtx) * (1 - specificity(mtx)) / sum(mtx[,2]))
specLower <- specificity(mtx) - 1.96 * spec_errors
specUpper <- specificity(mtx) + 1.96 * spec_errors

## Positive Predictive Values
ppv_errors <- sqrt(posPredValue(mtx) * (1 - posPredValue(mtx)) / sum(mtx[1,]))
ppvLower <- posPredValue(mtx) - 1.96 * ppv_errors
ppvUpper <- posPredValue(mtx) + 1.96 * ppv_errors


## Negative Predictive Values
npv_errors <- sqrt(negPredValue(mtx) * (1 - negPredValue(mtx)) / sum(mtx[2,]))
npvLower <- negPredValue(mtx) - 1.96 * npv_errors
npvUpper <- negPredValue(mtx) + 1.96 * npv_errors
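The four blocks above all apply the same normal-approximation (Wald) formula: estimate ± 1.96 × standard error. A compact sketch of that pattern is given below, using the cell counts shown in the commented table rather than re-running the model. wald_ci is a hypothetical helper name, and "M" is taken as the positive class, which matches caret's default of using the first factor level.

```r
# Cell counts copied from the commented table (rows = predicted, columns = observed)
mtx <- matrix(c(104, 23,
                  7, 74),
              nrow = 2, byrow = TRUE,
              dimnames = list(pred = c("M", "R"), obs = c("M", "R")))

# Hypothetical helper: Wald (normal-approximation) interval for a proportion x / n
wald_ci <- function(x, n, conf.level = 0.95) {
  p  <- x / n
  z  <- qnorm(1 - (1 - conf.level) / 2)  # about 1.96 for a 95% interval
  se <- sqrt(p * (1 - p) / n)
  c(estimate = p, lower = p - z * se, upper = p + z * se)
}

wald <- rbind(
  Sensitivity = wald_ci(mtx["M", "M"], sum(mtx[, "M"])),
  Specificity = wald_ci(mtx["R", "R"], sum(mtx[, "R"])),
  PPV         = wald_ci(mtx["M", "M"], sum(mtx["M", ])),
  NPV         = wald_ci(mtx["R", "R"], sum(mtx["R", ]))
)
print(round(wald, 3))
```

With small denominators or proportions close to 0 or 1, the Wald interval can be inaccurate and may even extend outside [0, 1]; in those cases an exact interval such as the one returned by binom.test() is likely the safer choice.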

3 solutions

Solution 1
1 2020-12-02 22:02:55

Solution 2
1 2021-04-11 18:40:01

Solution 3
0 2020-07-27 09:25:29
