
How to improve F1 score for classification

Original post: 2020-07-01 08:36:33 | Tags: python, performance, classification, grid-search, ensemble-learning

I am predicting whether a task will violate its given deadline (a binary classification problem).

I have used logistic regression, random forest, and XGBoost. All of them give an F1 score of about 56% for class label 1 (i.e., the F1 score for the positive class only).

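I have used:

StandardScaler()
GridSearchCV for hyperparameter tuning
Recursive Feature Elimination (for feature selection)
SMOTE (the dataset is imbalanced, so I used SMOTE to create new examples from the existing ones)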
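The steps listed above could be wired together roughly as follows. This is only a sketch on synthetic data, not the asker's actual code: the dataset, the parameter grid, and all variable names are assumptions. SMOTE comes from the separate imbalanced-learn package and is left out so the example depends on scikit-learn alone; `class_weight="balanced"` is swapped in as a stand-in for it. The point of the sketch is that scaling and feature selection sit inside the pipeline, so they are re-fitted on each training fold, and that the grid search optimizes the positive-class F1 directly.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the deadline data: roughly 10% positives (assumption).
X, y = make_classification(n_samples=600, n_features=12, n_informative=5,
                           weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Scaling and RFE live inside the pipeline, so each CV fold fits them
# on its own training part only.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("rfe", RFE(LogisticRegression(max_iter=1000))),
    # class_weight="balanced" is a stand-in for SMOTE in this sketch.
    ("clf", LogisticRegression(max_iter=1000, class_weight="balanced")),
])

# Hypothetical grid; the real search space depends on the model and data.
param_grid = {
    "rfe__n_features_to_select": [4, 8],
    "clf__C": [0.1, 1.0, 10.0],
}

# scoring="f1" is the binary F1 of the positive class (label 1).
search = GridSearchCV(
    pipe, param_grid, scoring="f1",
    cv=StratifiedKFold(n_splits=3, shuffle=True, random_state=0))
search.fit(X_train, y_train)

test_f1 = f1_score(y_test, search.predict(X_test), pos_label=1)
print(search.best_params_, round(test_f1, 3))
```

If SMOTE is kept, it would normally need to be applied to the training folds only (imbalanced-learn provides its own pipeline class for this), since oversampling before the split lets synthetic copies of validation points leak into training.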

to try to improve the F score of this model.

I've also created an ensemble model using EnsembleVoteClassifier. As you can see from the picture, the weighted F score is 94%; however, the F score for class 1 (i.e., the positive class, which says that the task will cross the deadline) is just 57%.
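The gap between the weighted score and the class 1 score is expected on imbalanced data, because the weighted average is dominated by the majority class. A small illustration with made-up labels (the counts below are hypothetical, not the asker's data):

```python
from sklearn.metrics import f1_score

# Hypothetical labels: 18 negatives followed by 2 positives.
y_true = [0] * 18 + [1] * 2
# 17 negatives correct, 1 false alarm, then 1 positive found and 1 missed.
y_pred = [0] * 17 + [1] + [1, 0]

f1_pos = f1_score(y_true, y_pred, pos_label=1)            # 0.5
f1_neg = f1_score(y_true, y_pred, pos_label=0)            # 17/18, about 0.944
f1_weighted = f1_score(y_true, y_pred, average="weighted")  # 0.9

print(f1_pos, round(f1_neg, 3), f1_weighted)
```

Here the weighted F1 is (18 × 17/18 + 2 × 0.5) / 20 = 0.9 even though only half of the positives are handled correctly, so the class 1 score is the one to track for this problem.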

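After applying all of the methods above, I have been able to raise the F1 score for label 1 from 6% to 57%. However, I am not sure what else I can do to improve the F score for label 1 further.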

2 Answers

Clearly, the fact that the dataset contains relatively few true 1 samples affects the classifier's performance.

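You have imbalanced data: many more 0 samples than 1 samples. There are several ways to deal with imbalanced data, and each learner you apply has its own "tricks" for it. A general approach you can try, however, is to resample the 1 samples, that is, to artificially increase the proportion of 1s in the dataset.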
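As a rough sketch of the resampling idea, the helper below duplicates minority rows at random until both classes are the same size. The function name `oversample_minority` and the toy data are hypothetical, and the sketch assumes binary 0/1 labels with class 1 as the smaller class.

```python
import numpy as np
from sklearn.utils import resample

def oversample_minority(X, y, minority_label=1, random_state=0):
    """Randomly duplicate minority rows until both classes have equal size."""
    X, y = np.asarray(X), np.asarray(y)
    is_min = y == minority_label
    X_min, X_maj = X[is_min], X[~is_min]
    # Draw with replacement so the minority class matches the majority count.
    X_min_up = resample(X_min, replace=True, n_samples=len(X_maj),
                        random_state=random_state)
    X_bal = np.vstack([X_maj, X_min_up])
    y_bal = np.concatenate([y[~is_min],
                            np.full(len(X_min_up), minority_label)])
    return X_bal, y_bal

# Toy training split: 10 positives and 90 negatives (made-up data).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 3))
y_train = np.array([1] * 10 + [0] * 90)

X_bal, y_bal = oversample_minority(X_train, y_train)
print(np.bincount(y_bal))  # [90 90]
```

Resampling of this kind is meant for the training split only; the test set should keep its original class proportions so that the reported F1 reflects real conditions.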

You can read more about the different options here: https://towardsdatascience.com/methods-for-dealing-with-imbalanced-data-5b761be45a18

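You should also try undersampling. In general, simply switching algorithms will not bring much improvement. You should look into more advanced ensemble-based techniques designed specifically to handle class imbalance.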
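One way to combine undersampling with an ensemble is to train several models, each on all positives plus a different random subset of negatives, and average their predictions. The following is a hand-rolled sketch of that idea under stated assumptions (synthetic data, shallow decision trees, hypothetical function names); it is not the method of any particular paper or library.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

def fit_undersampled_ensemble(X, y, n_models=10, seed=0):
    """Train each tree on all positives plus an equal-sized draw of negatives."""
    rng = np.random.default_rng(seed)
    pos_idx = np.flatnonzero(y == 1)
    neg_idx = np.flatnonzero(y == 0)
    models = []
    for i in range(n_models):
        # A fresh undersample of the majority class for every model.
        neg_sample = rng.choice(neg_idx, size=len(pos_idx), replace=False)
        idx = np.concatenate([pos_idx, neg_sample])
        tree = DecisionTreeClassifier(max_depth=4, random_state=i)
        models.append(tree.fit(X[idx], y[idx]))
    return models

def predict_ensemble(models, X, threshold=0.5):
    """Average the positive-class probabilities and apply a cut-off."""
    proba = np.mean([m.predict_proba(X)[:, 1] for m in models], axis=0)
    return (proba >= threshold).astype(int)

# Synthetic imbalanced data, roughly 10% positives (assumption).
X, y = make_classification(n_samples=1000, n_features=10, n_informative=5,
                           weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

models = fit_undersampled_ensemble(X_train, y_train)
y_pred = predict_ensemble(models, X_test)
print(round(f1_score(y_test, y_pred, pos_label=1), 3))
```

Because every model sees a balanced sample but the ensemble as a whole sees most of the negatives, less majority-class information is thrown away than with a single undersampled model.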

You can also try the method used in this paper: https://www.sciencedirect.com/science/article/abs/pii/S0031320312001471

Alternatively, you can look into more advanced data synthesis methods.
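For orientation, the basic interpolation step that SMOTE-style synthesis relies on can be sketched in a few lines. This is a simplified illustration, not the imbalanced-learn implementation; the function name `smote_like` and its parameters are hypothetical, and it assumes the minority class has more than `k` samples.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def smote_like(X_min, n_new, k=5, seed=0):
    """Create n_new points on segments between minority points and their neighbours."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    # Ask for k + 1 neighbours: column 0 is normally the point itself.
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    _, neigh = nn.kneighbors(X_min)
    # Pick a random base point and one of its k neighbours for each new sample.
    base = rng.integers(0, len(X_min), size=n_new)
    picked = neigh[base, rng.integers(1, k + 1, size=n_new)]
    # Move a random fraction of the way from the base point to the neighbour.
    gap = rng.random((n_new, 1))
    return X_min[base] + gap * (X_min[picked] - X_min[base])

# Toy minority class: 20 points in 2 dimensions (made-up data).
rng = np.random.default_rng(1)
X_min = rng.normal(size=(20, 2))
X_syn = smote_like(X_min, n_new=30)
print(X_syn.shape)  # (30, 2)
```

More advanced synthesis methods mostly differ in which base points they pick and how many samples they generate around each, rather than in this interpolation step.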

