It is common to scale the training and testing data separately before training and predicting progress of a classification task.
I want to embed the aforementioned process in RFECV
which runs CV tests thus I tried the following:
Do X_scaled = preprocessing.scale(X)
in the first place, where X
is the whole data set. By doing so, training and testing data are not scaled separately, which is not considered.
The other way I tried is to pass:
scaling_svm = Pipeline([('scaler', preprocessing.StandardScaler()),
('svm',LinearSVC(penalty=penalty, dual=False, class_weight='auto'))])
as parameter to the argument in RFECV
:
rfecv = RFECV(estimator=scaling_svm, step=1, cv=StratifiedKFold(y, 7),
scoring=score, verbose=0)
However, I got an error since RFECV
needs the estimator
to have attribute .coef_
. What should I suppose to do? Any help would be appreciated.
A bit late to the party, admittedly, but if anyone is interested you can create a customised pipeline as follows:
from sklearn.pipeline import Pipeline
class RfePipeline(Pipeline):
@property
def coef_(self):
return self._final_estimator.coef_
And then replace Pipeline
with RfePipeline
in your code.
See similar question here .
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.