I am using the sklearn RandomForestClassifier as my classifier. I could not figure out how to evaluate overfitting and underfitting for sklearn models.
model = RandomForestClassifier(n_estimators=1000, random_state=1, criterion='entropy', bootstrap=True, oob_score=True, verbose=1)
model.fit(X_train, y_train)
Currently, I am evaluating my model with other tools such as cross_val_score, confusion_matrix, classification_report, and PermutationImportance. Could someone please help me with this?
There are multiple ways to test for overfitting and underfitting. If you want to look specifically at train and test scores and compare them, you can do this with sklearn's cross_validate. According to the documentation, it returns a dictionary with train scores (if you pass return_train_score=True) and test scores for the metrics you supply. A train score much higher than the test score suggests overfitting; low scores on both suggest underfitting.
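Sample code:
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_validate
model = RandomForestClassifier(n_estimators=1000, random_state=1, criterion='entropy', bootstrap=True, oob_score=True, verbose=1)
cv_dict = cross_validate(model, X, y, return_train_score=True)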
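To read the result, you can average the per-fold scores and look at the gap between them. The following is only a rough sketch: the synthetic data from make_classification, the smaller n_estimators, and cv=5 are my assumptions to keep it quick and self-contained, not values from the question.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_validate

# Assumption: synthetic data standing in for the real X and y.
X, y = make_classification(n_samples=300, n_features=10, random_state=1)

# Assumption: fewer trees than in the question, only to keep the run short.
model = RandomForestClassifier(n_estimators=50, random_state=1, criterion='entropy')

# cv_dict holds one train score and one test score per fold.
cv_dict = cross_validate(model, X, y, cv=5, return_train_score=True)

train_mean = cv_dict['train_score'].mean()
test_mean = cv_dict['test_score'].mean()
gap = train_mean - test_mean

print(f"mean train accuracy: {train_mean:.3f}")
print(f"mean test accuracy:  {test_mean:.3f}")
print(f"gap (train - test):  {gap:.3f}")
```

How large a gap counts as "too large" depends on the problem, so treat the number as a signal to investigate rather than a hard threshold.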
You can also simply create a hold-out test set with train_test_split, fit the model on the training portion, and compare its score on the training data with its score on the held-out test data.
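A hold-out comparison might look roughly like this. Again the data is synthetic and the split size and tree count are my assumptions. Since the model in the question sets oob_score=True, the sketch also prints oob_score_, the out-of-bag accuracy the forest computes on the training data, which gives another estimate to compare against the training score.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Assumption: synthetic data standing in for the real X and y.
X, y = make_classification(n_samples=300, n_features=10, random_state=1)

# Assumption: hold out 25% of the rows as the test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=1, stratify=y
)

# Assumption: fewer trees than in the question, only to keep the run short.
model = RandomForestClassifier(
    n_estimators=50, random_state=1, criterion='entropy',
    bootstrap=True, oob_score=True,
)
model.fit(X_train, y_train)

train_acc = model.score(X_train, y_train)  # accuracy on data the model has seen
test_acc = model.score(X_test, y_test)     # accuracy on held-out data
oob_acc = model.oob_score_                 # out-of-bag estimate from training rows

print(f"train accuracy: {train_acc:.3f}")
print(f"test accuracy:  {test_acc:.3f}")
print(f"OOB accuracy:   {oob_acc:.3f}")
```

A fully grown random forest often scores close to 1.0 on its own training data, so the test and OOB numbers are usually the more informative ones.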