
RStudio - I need the confidence intervals of sensitivity and specificity and positive and negative predictive values using confusion matrix

I am writing a paper about the validity of a billing code in hospitalized children. I am a very novice RStudio user. I need the confidence intervals for the sensitivity, specificity, and positive and negative predictive values, but I can't figure out how to get them.

My data has 3 columns: ID, true value, billing value

Here is my code:

confusionMatrix(table(finalcodedataset$billing_value, finalcodedataset$true_value), 
                positive="1", boot=TRUE, boot_samples=4669, alpha=0.05)

Here is the output:

Confusion Matrix and Statistics

       0    1
  0 4477  162
  1   10   20

               Accuracy : 0.9632          
                 95% CI : (0.9574, 0.9684)
    No Information Rate : 0.961           
    P-Value [Acc > NIR] : 0.238           

                  Kappa : 0.1796          
 Mcnemar's Test P-Value : <2e-16          

            Sensitivity : 0.109890        
            Specificity : 0.997771        
         Pos Pred Value : 0.666667        
         Neg Pred Value : 0.965079        
             Prevalence : 0.038981        
         Detection Rate : 0.004284        
   Detection Prevalence : 0.006425        
      Balanced Accuracy : 0.553831        

       'Positive' Class : 1   

caret and other packages use the Clopper-Pearson (exact binomial) method to calculate the confidence interval. In the output above, that interval is reported for accuracy only.
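As a quick check, the accuracy interval can be reproduced with base R's binom.test(), which returns the exact (Clopper-Pearson) interval. This is a minimal sketch using the counts from the question's table; the variable names are only illustrative.

```r
# Counts taken from the confusion matrix in the question
correct <- 4477 + 20               # diagonal: true negatives + true positives
total   <- 4477 + 162 + 10 + 20    # all observations

# binom.test() stores the exact (Clopper-Pearson) interval in $conf.int
acc_ci <- binom.test(correct, total, conf.level = 0.95)$conf.int
round(acc_ci, 4)
```

The result should agree with the "95% CI" line printed under Accuracy in the output above.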

I treat your 2x2 table as reversed because the TP (true positive) cell is at the bottom right. If TP were at the top left, the variables (A, B, C, D) would be switched.

# Cell counts (TP is at the bottom right of the table above)
D = 4477  # true negatives
C = 162   # false negatives
B = 10    # false positives
A = 20    # true positives

Acc = (A+D)/(A+B+C+D)
Sensitivity = A / (A + C)
Specificity = D / (D + B)
P = (A+C)/(A+B+C+D)  # prevalence
PPV = (Sensitivity*P)/((Sensitivity*P)+((1-Specificity)*(1-P)))
NPV = (Specificity*(1-P))/(((1 - Sensitivity)*P)+((Specificity)*(1-P)))

# Clopper-Pearson interval for accuracy, computed from the number of errors x
n = A+B+C+D
x = n - (A+D)
alpha = 0.05

ub = 1 - ((1 + (n - x + 1)/ (x * qf(alpha *.5, 2*x, 2*(n - x + 1))))^-1)
lb = 1 - ((1 + (n - x) / ((x + 1)* qf(1-(alpha*.5), 2*(x+1), 2*(n-x))))^-1)
CI = c(lb,ub)

> Acc
[1] 0.9631613
> CI
[1] 0.9573536 0.9683800
> Sensitivity
[1] 0.1098901
> Specificity
[1] 0.9977713
> PPV
[1] 0.6666667
> NPV
[1] 0.9650787
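The interval above covers accuracy only. Since sensitivity, specificity, PPV and NPV are each a simple proportion of cell counts, one possible way to get exact intervals for them is to call binom.test() once per measure. This is a sketch, not what caret does internally: exact_ci is a hypothetical helper, and treating PPV and NPV as binomial proportions assumes the prevalence in the sample reflects the population of interest.

```r
# Cell counts from the question's table (rows = billing code, columns = true value)
TP <- 20; FP <- 10; FN <- 162; TN <- 4477

# Hypothetical helper: point estimate plus exact (Clopper-Pearson) interval
exact_ci <- function(successes, trials, conf.level = 0.95) {
  bt <- binom.test(successes, trials, conf.level = conf.level)
  c(estimate = unname(bt$estimate),
    lower    = bt$conf.int[1],
    upper    = bt$conf.int[2])
}

# One row per measure; the denominator is the relevant row or column total
results <- rbind(
  Sensitivity = exact_ci(TP, TP + FN),
  Specificity = exact_ci(TN, TN + FP),
  PPV         = exact_ci(TP, TP + FP),
  NPV         = exact_ci(TN, TN + FN)
)
round(results, 4)
```

The estimate column should match the point estimates printed above; the lower and upper columns give the interval for each measure.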

Here is also a good resource for where these formulas come from.

You can use the epiR package for this purpose.


library(epiR)

# Rows are test results (+ then -), columns are true outcomes (+ then -)
data <- as.table(matrix(c(670,202,74,640), nrow = 2, byrow = TRUE))
rval <- epi.tests(data, conf.level = 0.95)
print(rval)

          Outcome +    Outcome -      Total
Test +          670          202        872
Test -           74          640        714
Total           744          842       1586

Point estimates and 95 % CIs:
Apparent prevalence                    0.55 (0.52, 0.57)
True prevalence                        0.47 (0.44, 0.49)
Sensitivity                            0.90 (0.88, 0.92)
Specificity                            0.76 (0.73, 0.79)
Positive predictive value              0.77 (0.74, 0.80)
Negative predictive value              0.90 (0.87, 0.92)
Positive likelihood ratio              3.75 (3.32, 4.24)
Negative likelihood ratio              0.13 (0.11, 0.16)
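In the example above the positive test and positive outcome sit in the top-left cell, whereas the table in the question lists level 0 first. A minimal sketch of reordering the question's table before passing it to epi.tests() follows; the object names are illustrative, and the epiR call is guarded so the snippet still runs when the package is not installed.

```r
# The question's table: rows = billing code (test), columns = true value, level 0 first
tab <- as.table(matrix(c(4477, 162, 10, 20), nrow = 2, byrow = TRUE,
                       dimnames = list(billing = c("0", "1"),
                                       truth   = c("0", "1"))))

# Reverse both dimensions so the positive level "1" comes first
tab_pos_first <- as.table(tab[c("1", "0"), c("1", "0")])
tab_pos_first

# Run epi.tests() only if epiR is available
if (requireNamespace("epiR", quietly = TRUE)) {
  print(epiR::epi.tests(tab_pos_first, conf.level = 0.95))
}
```

After reordering, the top-left cell holds the 20 true positives, which is the layout used in the example above.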

The following reproducible example is partially inspired by "ROC curve from training data in caret".


library(caret)
library(mlbench)  # provides the Sonar data set
data(Sonar)

ctrl <- trainControl(method = "cv", summaryFunction = twoClassSummary,
                     classProbs = TRUE, savePredictions = TRUE)
fit1 <- train(Class ~ ., data = Sonar, method = "rf", trControl = ctrl)

bestmodel <- merge(fit1$bestTune, fit1$pred)
mtx <- confusionMatrix(table(bestmodel$pred, bestmodel$obs))$table

 #     M   R
 # M 104  23
 # R   7  74

# 95% Confidence Intervals (normal approximation: estimate +/- 1.96 * standard error)

## Sensitivity
sens_errors <- sqrt(sensitivity(mtx) * (1 - sensitivity(mtx)) / sum(mtx[,1]))
sensLower <- sensitivity(mtx) - 1.96 * sens_errors
sensUpper <- sensitivity(mtx) + 1.96 * sens_errors

## Specificity
spec_errors <- sqrt(specificity(mtx) * (1 - specificity(mtx)) / sum(mtx[,2]))
specLower <- specificity(mtx) - 1.96 * spec_errors
specUpper <- specificity(mtx) + 1.96 * spec_errors

## Positive Predictive Values
ppv_errors <- sqrt(posPredValue(mtx) * (1 - posPredValue(mtx)) / sum(mtx[1,]))
ppvLower <- posPredValue(mtx) - 1.96 * ppv_errors
ppvUpper <- posPredValue(mtx) + 1.96 * ppv_errors

## Negative Predictive Values
npv_errors <- sqrt(negPredValue(mtx) * (1 - negPredValue(mtx)) / sum(mtx[2,]))
npvLower <- negPredValue(mtx) - 1.96 * npv_errors
npvUpper <- negPredValue(mtx) + 1.96 * npv_errors
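The four blocks above repeat the same normal-approximation (Wald) calculation, so it can be wrapped in a small function. The sketch below uses a hypothetical helper, wald_ci, and compares it with the exact interval from binom.test() for the PPV in the question (20 of 30), where the small denominator makes the approximation less reliable.

```r
# Hypothetical helper: Wald (normal-approximation) interval for a proportion x / n
wald_ci <- function(x, n, conf.level = 0.95) {
  p  <- x / n
  z  <- qnorm(1 - (1 - conf.level) / 2)   # about 1.96 for a 95% interval
  se <- sqrt(p * (1 - p) / n)             # standard error of the proportion
  c(estimate = p, lower = p - z * se, upper = p + z * se)
}

# Sensitivity from the Sonar table above: 104 of 111 reference "M" cases
sens_wald <- wald_ci(104, 111)
sens_wald

# PPV from the question's billing data: 20 true positives out of 30 positive codes
ppv_wald  <- wald_ci(20, 30)
ppv_exact <- binom.test(20, 30)$conf.int  # exact (Clopper-Pearson) interval
ppv_wald
ppv_exact
```

Note that the Wald limits are not constrained to stay inside [0, 1], so for proportions close to 0 or 1 or for small counts the exact interval is usually the safer choice.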
