
How can I find the best ML model from the results below?

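I am trying to predict loan defaulters by training models on the Lending Club dataset. I am finding it hard to choose a model from the results I obtained. How can I pick the right one?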

The results below come from the different models:

--------------------------------------------------
random forest
--------------------------------------------------
              precision    recall  f1-score   support

         0.0       0.75      0.94      0.83      3401
         1.0       0.94      0.74      0.83      4125

    accuracy                           0.83      7526
   macro avg       0.84      0.84      0.83      7526
weighted avg       0.85      0.83      0.83      7526

Confusion Matrix:  
 [[3196  205]  
 [1081 3044]]
Training Accuracy:  0.9854712969525159  
Testing Accuracy:  0.8291256975817167  
Prediction with data having all values as 0:  Counter({0.0: 468, 1.0: 32})  
Prediction with data having all values as 1:  Counter({1.0: 365, 0.0: 135})

--------------------------------------------------
logistic regression
--------------------------------------------------
              precision    recall  f1-score   support

         0.0       0.76      0.83      0.79      3401
         1.0       0.85      0.78      0.81      4125

    accuracy                           0.80      7526
   macro avg       0.80      0.81      0.80      7526
weighted avg       0.81      0.80      0.80      7526

Training Accuracy:  0.7995659107016301  
Testing Accuracy:  0.8037470103640713  
Confusion Matrix:  
 [[2828  573]  
 [ 904 3221]]  
Prediction with data having all values as 0:  Counter({0.0: 406, 1.0: 94})  
Prediction with data having all values as 1:  Counter({1.0: 379, 0.0: 121})
--------------------------------------------------
k nearest neighbor
--------------------------------------------------
              precision    recall  f1-score   support

         0.0       0.73      0.94      0.82      3401
         1.0       0.93      0.72      0.81      4125

    accuracy                           0.82      7526
   macro avg       0.83      0.83      0.82      7526
weighted avg       0.84      0.82      0.82      7526

Training Accuracy:  0.8770818568391212
Testing Accuracy:  0.8161041722030294
Confusion Matrix:
 [[3188  213]
 [1171 2954]]
Prediction with data having all values as 0:  Counter({0.0: 460, 1.0: 40})  
Prediction with data having all values as 1:  Counter({1.0: 353, 0.0: 147})
--------------------------------------------------
cat boost
--------------------------------------------------
              precision    recall  f1-score   support

         0.0       0.75      0.98      0.85      3401
         1.0       0.98      0.72      0.83      4125

    accuracy                           0.84      7526
   macro avg       0.86      0.85      0.84      7526
weighted avg       0.87      0.84      0.84      7526

Training Accuracy:  0.8628632175761871
Testing Accuracy:  0.8388254052617592
Confusion Matrix:
 [[3325   76]
 [1137 2988]]
Prediction with data having all values as 0:  Counter({0.0: 485, 1.0: 15})
Prediction with data having all values as 1:  Counter({1.0: 365, 0.0: 135})
--------------------------------------------------
xgboost
--------------------------------------------------
              precision    recall  f1-score   support

         0.0       0.74      1.00      0.85      3401
         1.0       1.00      0.71      0.83      4125

    accuracy                           0.84      7526
   macro avg       0.87      0.86      0.84      7526
weighted avg       0.88      0.84      0.84      7526

Training Accuracy:  0.8437278525868178
Testing Accuracy:  0.8417486048365665
Confusion Matrix:
 [[3393    8]
 [1183 2942]]
Prediction with data having all values as 0:  Counter({0.0: 497, 1.0: 3})
Prediction with data having all values as 1:  Counter({1.0: 357, 0.0: 143})
--------------------------------------------------

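Your last model (xgboost) looks like the best overall: it has the highest test accuracy (0.8417) and the highest macro-average precision (0.87) and recall (0.86), and its f1-score matches cat boost's (0.84). Its recall for class 1.0 (0.71) is the lowest of the five, though, which matters if missing that class is costly. The one score that should not drive the choice is the accuracy on the training set, because the models are being compared on the test set.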
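To make the comparison concrete, here is a minimal sketch that recomputes accuracy and the class 1.0 precision, recall and F1 directly from the confusion matrices posted in the question, then ranks the models by a chosen metric. The names `CONFUSION`, `summarize` and `rank_by` are made up for this illustration. It assumes the matrices follow scikit-learn's layout (rows are true labels, columns are predicted labels), which is consistent with the row sums matching the reported support values.

```python
# Confusion matrices copied from the question.
# Assumed layout: rows = true labels, columns = predicted labels.
CONFUSION = {
    "random forest":       [[3196, 205], [1081, 3044]],
    "logistic regression": [[2828, 573], [904, 3221]],
    "k nearest neighbor":  [[3188, 213], [1171, 2954]],
    "cat boost":           [[3325, 76], [1137, 2988]],
    "xgboost":             [[3393, 8], [1183, 2942]],
}


def summarize(cm):
    """Return accuracy plus precision, recall and F1 for class 1.0."""
    (tn, fp), (fn, tp) = cm
    total = tn + fp + fn + tp
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": (tp + tn) / total,
        "precision_1": precision,
        "recall_1": recall,
        "f1_1": f1,
    }


def rank_by(metric):
    """Return model names sorted from best to worst on one metric."""
    scores = {name: summarize(cm)[metric] for name, cm in CONFUSION.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Print one line per model so the metrics can be compared side by side.
for name, cm in CONFUSION.items():
    row = summarize(cm)
    print(f"{name:20s}", "  ".join(f"{k}={v:.3f}" for k, v in row.items()))

print("best accuracy:", rank_by("accuracy")[0])
print("best class 1.0 recall:", rank_by("recall_1")[0])
```

Ranking on accuracy and ranking on class 1.0 recall pick different models, which is why the metric has to be chosen before the model is.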

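Ideally every metric would be as high as possible, but that is not always achievable, so to select a model you need to understand what each metric means in light of what you are trying to achieve. You often have to find a trade-off. Note that the f1-score is the harmonic mean of precision and recall, so a high f1-score requires both precision and recall to be high.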
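One way to turn that trade-off into a number is to attach a cost to each kind of error and pick the model with the lowest total. The sketch below does this with the error counts from the question's confusion matrices. It rests on two assumptions that the question does not confirm: that class 1.0 means "defaulted", and that the cost ratios used are purely illustrative. `ERRORS`, `total_cost` and `cheapest_model` are hypothetical names.

```python
# Error counts read off the confusion matrices in the question.
# ASSUMPTION: class 1.0 means "defaulted", so
#   fn = true 1.0 predicted as 0.0 (a missed defaulter)
#   fp = true 0.0 predicted as 1.0 (a good borrower flagged)
ERRORS = {
    "random forest":       {"fn": 1081, "fp": 205},
    "logistic regression": {"fn": 904,  "fp": 573},
    "k nearest neighbor":  {"fn": 1171, "fp": 213},
    "cat boost":           {"fn": 1137, "fp": 76},
    "xgboost":             {"fn": 1183, "fp": 8},
}


def total_cost(errors, cost_fn, cost_fp):
    """Total cost of a model's test-set mistakes under the given unit costs."""
    return errors["fn"] * cost_fn + errors["fp"] * cost_fp


def cheapest_model(cost_fn, cost_fp):
    """Name of the model with the lowest total cost."""
    return min(ERRORS, key=lambda name: total_cost(ERRORS[name], cost_fn, cost_fp))


# Illustrative cost ratios only; real costs depend on the lending business.
print("equal costs:", cheapest_model(cost_fn=1, cost_fp=1))
print("missed defaulter costs 5x:", cheapest_model(cost_fn=5, cost_fp=1))
```

With equal costs the choice is xgboost, but once a missed defaulter is weighted five times as heavily, logistic regression comes out ahead because it has the fewest false negatives.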


Notice: the technical posts on this site are published under the CC BY-SA 4.0 license. If you reproduce them, please credit this site's URL or the original address.

 