
R random forest by sensitivity

Is it possible to run a supervised-classification random forest that maximizes sensitivity (TP / (TP + FN))? As far as I know, the available metrics are Accuracy and Kappa. Below is a real example where both Kappa and Accuracy fail to evaluate the model as desired. As pointed out in the answer and comments (@Hanjo and @Aaron), sensitivity alone is not a good metric.

Confusion matrix (rows = actual class, columns = predicted class, T = total):

       0     1     T
0   1213    50  1263
1    608    63   671
T   1821   113  1934

> Precisao(prev_table)
[1] "accuracy(TP+TN/T)= 0.66"
[1] "precision(TP/TP+FP)= 0.558"
[1] "sensitivity(TP/TP+FN)= 0.0939"
[1] "positive= 671 0.347"
[1] "negative= 1263 0.653"
[1] "predicted positive= 113 0.0584"
[1] "predicted negative= 1821 0.942"
[1] "Total= 1934"

These actual-versus-predicted results fall well short of the goal.
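The reported figures can be reproduced directly from the confusion matrix. The following is a minimal base-R sketch, assuming rows are the actual classes and columns the predicted classes (which is what the reported precision and sensitivity imply). The names `cm` and `kappa_stat`, and the specificity and Kappa calculations, are additions for illustration and are not part of the original `Precisao` function.

```r
# Confusion matrix from the question.
# Assumption: rows = actual class, columns = predicted class.
cm <- matrix(c(1213, 50,
                608, 63),
             nrow = 2, byrow = TRUE,
             dimnames = list(actual = c("0", "1"), predicted = c("0", "1")))

n  <- sum(cm)
tp <- cm["1", "1"]; tn <- cm["0", "0"]
fp <- cm["0", "1"]; fn <- cm["1", "0"]

accuracy    <- (tp + tn) / n
precision   <- tp / (tp + fp)
sensitivity <- tp / (tp + fn)
specificity <- tn / (tn + fp)

# Cohen's Kappa: observed accuracy compared with the accuracy expected
# by chance, where the chance level comes from the row and column totals.
expected   <- sum(rowSums(cm) * colSums(cm)) / n^2
kappa_stat <- (accuracy - expected) / (1 - expected)

print(round(c(accuracy = accuracy, precision = precision,
              sensitivity = sensitivity, specificity = specificity,
              kappa = kappa_stat), 3))
```

On these counts Kappa comes out at roughly 0.07, far less flattering than the 0.66 accuracy, because about 65% of the cases are negative and the model predicts negative almost all the time.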

Let me elaborate on why choosing "sensitivity" or "specificity" as the performance metric might not be a good idea, and why I say you should perhaps go for kappa (especially with unbalanced class predictions).

Imagine we have the following dataset and prediction outcomes:

x   Outcome Prediction
0.515925884 1   1
0.416949071 0   1
0.112185499 0   1
0.557334124 0   1
0.599717812 0   1
0.272965861 1   1
0.898911346 0   1
0.347428065 0   1

If the model predicted a 1 on all observations, you would have 100% sensitivity and would falsely presume that the model was doing well. The same is true if the model predicted all outcomes as 0, which gives 100% specificity. But does this mean the model is well tuned? Obviously not, as a simple rule of 'predicting' every outcome as positive will give you a sensitivity of 100%. Now, kappa uses the following measurement of model performance:
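To make the point concrete, here is a small base-R sketch that scores the "always predict 1" rule on the eight observations listed above. The variable names are illustrative only.

```r
# Toy data from the table above: true outcomes and an "always predict 1" model.
outcome    <- c(1, 0, 0, 0, 0, 1, 0, 0)
prediction <- rep(1, length(outcome))

tp <- sum(prediction == 1 & outcome == 1)  # true positives
fn <- sum(prediction == 0 & outcome == 1)  # false negatives
tn <- sum(prediction == 0 & outcome == 0)  # true negatives
fp <- sum(prediction == 1 & outcome == 0)  # false positives

sensitivity <- tp / (tp + fn)  # 2 / 2
specificity <- tn / (tn + fp)  # 0 / 6
accuracy    <- (tp + tn) / length(outcome)

cat("sensitivity =", sensitivity, "\n")
cat("specificity =", specificity, "\n")
cat("accuracy    =", accuracy, "\n")
```

Sensitivity is perfect, yet specificity is zero and only two of the eight predictions are correct, so a model tuned on sensitivity alone could settle on exactly this useless rule.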

The Kappa statistic (or value) is a metric that compares an Observed Accuracy with an Expected Accuracy (random chance). This is a much more representative measure of the performance of your model. A nice answer explaining this can be found on Stats Exchange.
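As a rough illustration of that definition, the sketch below computes Kappa in base R for the eight-observation example, comparing a model that always predicts 1 with a perfect one. `cohen_kappa` is a hypothetical helper written for this example, not a function from any package.

```r
# Cohen's Kappa from two label vectors, using base R only.
# `cohen_kappa` is a hypothetical helper name used for illustration.
cohen_kappa <- function(actual, predicted) {
  lv  <- sort(union(actual, predicted))
  # Shared factor levels keep the table square even if a class is never predicted.
  tab <- table(factor(actual, levels = lv), factor(predicted, levels = lv))
  n   <- sum(tab)
  observed <- sum(diag(tab)) / n                       # observed accuracy
  expected <- sum(rowSums(tab) * colSums(tab)) / n^2   # accuracy expected by chance
  (observed - expected) / (1 - expected)
}

outcome    <- c(1, 0, 0, 0, 0, 1, 0, 0)
always_one <- rep(1, length(outcome))

print(cohen_kappa(outcome, always_one))  # no better than chance
print(cohen_kappa(outcome, outcome))     # perfect agreement
```

The always-positive rule scores a Kappa of 0 despite its 100% sensitivity, while perfect predictions score 1, which is why Kappa is harder to fool with a trivial rule on unbalanced classes.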
