sklearn - How to retrieve PCA components and explained variance from inside a Pipeline passed to GridSearchCV
I am using GridSearchCV with a pipeline as follows:
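from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

grid = GridSearchCV(
    Pipeline([
        ('reduce_dim', PCA()),
        ('classify', RandomForestClassifier(n_jobs=-1))
    ]),
    param_grid=[
        {
            # range() only accepts integers, so the fractions are listed explicitly
            'reduce_dim__n_components': [0.7, 0.8],
            'classify__n_estimators': range(10, 50, 5),
            # 'sqrt' is what 'auto' meant for classifiers; 'auto' was removed in recent scikit-learn releases
            'classify__max_features': ['sqrt', 0.2],
            'classify__min_samples_leaf': [40, 50, 60],
            'classify__criterion': ['gini', 'entropy']
        }
    ],
    cv=5, scoring='f1')
grid.fit(X, y)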
How do I now retrieve PCA details like components and explained_variance from the grid.best_estimator_ model?
Furthermore, I also want to save the best_estimator_ to a file using pickle and later load it. How do I retrieve the PCA details from this loaded estimator? I suspect it will be the same as above.
grid.best_estimator_ gives you the pipeline refit with the best parameters.
Now use the named_steps attribute to access the internal estimators of the pipeline.
So grid.best_estimator_.named_steps['reduce_dim'] will give you the PCA object. You can then use it to access the components_ and explained_variance_ attributes (note the trailing underscore) of this PCA object like this:
grid.best_estimator_.named_steps['reduce_dim'].components_
grid.best_estimator_.named_steps['reduce_dim'].explained_variance_
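For the pickle part of the question, here is a minimal end-to-end sketch. It assumes synthetic data from make_classification, a deliberately small parameter grid with integer n_components, and a temporary file named best_estimator.pkl; all of these are stand-ins chosen for illustration, not part of the original setup. It suggests that the loaded pipeline exposes the PCA step the same way as grid.best_estimator_ does.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Synthetic binary-classification data (an assumption; use your own X, y).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Small grid so the sketch runs quickly (assumed values, not the original grid).
grid = GridSearchCV(
    Pipeline([
        ("reduce_dim", PCA()),
        ("classify", RandomForestClassifier(n_estimators=10, random_state=0)),
    ]),
    param_grid={"reduce_dim__n_components": [2, 3]},
    cv=3,
    scoring="f1",
)
grid.fit(X, y)

# Pull the fitted PCA step out of the best pipeline.
pca = grid.best_estimator_.named_steps["reduce_dim"]
components = pca.components_          # shape: (n_components, n_features)
explained = pca.explained_variance_   # shape: (n_components,)

# Save the best pipeline to a file and load it back (hypothetical file name).
path = os.path.join(tempfile.mkdtemp(), "best_estimator.pkl")
with open(path, "wb") as f:
    pickle.dump(grid.best_estimator_, f)
with open(path, "rb") as f:
    restored = pickle.load(f)

# The loaded object is again a Pipeline, so the same access pattern applies.
restored_pca = restored.named_steps["reduce_dim"]
same_components = np.allclose(restored_pca.components_, components)
same_variance = np.allclose(restored_pca.explained_variance_, explained)
```

The usual pickle caveat applies: load the file with the same scikit-learn version that wrote it, and only unpickle files you trust.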