
Question on tuning hyper-parameters with scikit-learn GridSearchCV

Does performing grid search on hyper-parameters guarantee improved performance when tested on the same data set?

I ask because my intuition was "yes"; however, I got slightly lower scores after tuning my regularization constant:

import numpy as np
from sklearn import linear_model, metrics, model_selection

classifier_os = linear_model.LogisticRegression()

p_grid = {
    'C': np.logspace(-3, 3, 7)  # candidate values for the inverse regularization strength C
}

# 5-fold cross-validated grid search (the GridSearchCV default), refit on the best C
clf = model_selection.GridSearchCV(classifier_os, p_grid, scoring='accuracy')
clf.fit(x_train, y_train)

# classification_report expects (y_true, y_pred)
y_pred = clf.predict(x_test)
metrics.classification_report(y_test, y_pred, output_dict=True)

Gives me the following scores:

accuracy :  0.8218181818181818
 macro avg: 
     precision :  0.8210875331564986
     recall :  0.8213603058298822
     f1-score :  0.8212129655428624
     support :  275

As compared to before tuning:

accuracy :  0.8290909090909091
 macro avg: 
     precision :  0.8287798408488063
     recall :  0.8285358354537744
     f1-score :  0.8286468069310212

The only thing the tuning changed was the regularization constant: C = 10 instead of the default 1.
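For a direct before/after comparison on the same held-out test set, both settings can be fit explicitly. A minimal sketch, reusing the x_train / y_train / x_test / y_test arrays from the code above:

# Default regularization (C=1.0) versus the value the grid search picked (C=10)
default_clf = linear_model.LogisticRegression().fit(x_train, y_train)
tuned_clf = linear_model.LogisticRegression(C=10).fit(x_train, y_train)

print(metrics.accuracy_score(y_test, default_clf.predict(x_test)))
print(metrics.accuracy_score(y_test, tuned_clf.predict(x_test)))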

GridSearchCV, if not told otherwise, performs 5-fold cross-validation and scores each candidate on the held-out folds; it picks the hyper-parameters with the best cross-validated score on the training data, which does not guarantee a higher score on a separate test set. The averaged accuracy is sometimes not the most informative metric to look at, so F1 is a good choice. After fitting, the search also exposes best_params_ and best_score_. You would use best_params_ in the final model to see how well it does after tuning.
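A minimal sketch of how that might look, reusing clf, p_grid, and the train/test arrays from the question (f1_macro is one reasonable choice of scorer, not the only one):

# Inspect what the 5-fold search selected
print(clf.best_params_)   # e.g. {'C': 10.0}
print(clf.best_score_)    # mean cross-validated accuracy for that C

# To tune against macro F1 instead of accuracy, pass a different scorer
clf_f1 = model_selection.GridSearchCV(
    linear_model.LogisticRegression(), p_grid, scoring='f1_macro', cv=5)
clf_f1.fit(x_train, y_train)

# GridSearchCV refits the best estimator on the full training set,
# so .score() on the test set gives the post-tuning estimate
print(clf_f1.score(x_test, y_test))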

Reference:
Grid Search Sklearn
