Transform results of estimator in a sklearn pipeline
I have a sklearn pipeline that consists of a custom transformer followed by XGBClassifier. As a final step in the pipeline, I would like to add another custom transformer that transforms the results of the XGBClassifier.
This last custom transformer will bin the predicted probabilities into ranks (5-percentiles).
from sklearn.pipeline import Pipeline
from xgboost import XGBClassifier

Pipeline([
    ('custom_trsf1', custom_trsf1),
    ('clf', XGBClassifier()),
    ('custom_trsf2', custom_trsf2)])
The problem is that a sklearn pipeline requires every step except the last to have both a fit and a transform method. Can I solve this some other way than extending XGBClassifier and adding a transform method to it?
Looking at the source code of the Pipeline implementation, the estimator that is fitted to the data has to be in the last position of your steps: the _final_estimator property of Pipeline returns the last entry of the Pipeline's steps.
@property
def _final_estimator(self):
    estimator = self.steps[-1][1]
    return 'passthrough' if estimator is None else estimator
where steps might be something like
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

steps = [('scaler', StandardScaler(copy=True, with_mean=True, with_std=True)),
         ('svc',
          SVC(C=1.0, break_ties=False, cache_size=200, class_weight=None, coef0=0.0,
              decision_function_shape='ovr', degree=3, gamma='scale', kernel='rbf',
              max_iter=-1, probability=False, random_state=None, shrinking=True,
              tol=0.001, verbose=False))]
The _final_estimator property is only read after all the transformers have been fitted one after the other, to get the estimator that is then fitted on the transformed data; see line 333 of the source for details.
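To make that flow concrete, here is a minimal pure-Python sketch of it. This is a simplified re-creation, not sklearn's actual code: AddOne, MeanPredictor and fit_pipeline are hypothetical names invented for illustration, and the real Pipeline does more (caching, 'passthrough' handling, parameter routing).

```python
# Hypothetical toy classes; none of this is part of sklearn.
class AddOne:
    """Toy transformer: implements both fit and transform."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return [x + 1 for x in X]


class MeanPredictor:
    """Toy final estimator: implements fit and predict, but no transform."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


def fit_pipeline(steps, X, y):
    """Fit and apply every step but the last, then fit the last step on the result."""
    Xt = X
    for name, transformer in steps[:-1]:
        if not (hasattr(transformer, "fit") and hasattr(transformer, "transform")):
            raise TypeError(f"intermediate step {name!r} must implement fit and transform")
        Xt = transformer.fit(Xt, y).transform(Xt)
    final_estimator = steps[-1][1]  # same lookup as _final_estimator
    final_estimator.fit(Xt, y)
    return Xt


toy_steps = [("add_one", AddOne()), ("model", MeanPredictor())]
Xt = fit_pipeline(toy_steps, [1, 2, 3], [10, 20, 30])
print(Xt)                             # [2, 3, 4]
print(toy_steps[-1][1].predict([0]))  # [20.0]

# Putting the predictor in the middle fails, because it has no transform method.
try:
    fit_pipeline([("model", MeanPredictor()), ("post", AddOne())], [1, 2, 3], [10, 20, 30])
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

The second call mirrors the situation in the question: a step that only predicts cannot sit in the middle of the chain, because nothing is available to pass its output on to the next step.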
So, given steps, I can retrieve the SVC instance from its last position
final_estimator = steps[-1][1]
final_estimator
# SVC(C=1.0, ..., verbose=False)
and fit it to the training data
final_estimator.fit(Xt, y)
where Xt is the transformed training data (calculated before fitting the estimator) and y is the training target.
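As a rough check of this description, the manual steps can be compared against Pipeline itself. The sketch below assumes a synthetic dataset from make_classification and default StandardScaler and SVC settings; the variable names (manual_pred, pipeline_pred) are only illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic data, used only for illustration.
X, y = make_classification(n_samples=100, n_features=4, random_state=0)

steps = [("scaler", StandardScaler()), ("svc", SVC())]

# Manual version: fit and apply every step but the last ...
Xt = X
for _, transformer in steps[:-1]:
    Xt = transformer.fit_transform(Xt, y)

# ... then fit the last step on the transformed data.
final_estimator = steps[-1][1]
final_estimator.fit(Xt, y)
manual_pred = final_estimator.predict(Xt)

# Reference: let Pipeline run the same sequence on fresh, unfitted steps.
pipe = Pipeline([("scaler", StandardScaler()), ("svc", SVC())])
pipe.fit(X, y)
pipeline_pred = pipe.predict(X)

# Both routes should give the same predictions on the training data.
print(np.array_equal(manual_pred, pipeline_pred))
```

The public pipe.steps[-1][1] lookup is used here rather than the private _final_estimator property, since private attributes may change between sklearn versions.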