Transforming the results of an estimator in a sklearn pipeline

Question

I have a sklearn pipeline that consists of a custom transformer followed by an XGBClassifier. As a final step in the pipeline, I would like to add another custom transformer that transforms the results of the XGBClassifier.

This last custom transformer will convert the predicted probabilities into ranks (5-percentiles).
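
To make the ranking step concrete, here is a minimal sketch of what such a transformation could look like. It assumes that "5-percentiles" means buckets that are each 5 percentiles wide (20 buckets in total); the function name rank_into_percentile_bins and the step parameter are hypothetical and not part of the original question.

```python
import numpy as np


def rank_into_percentile_bins(proba, step=5):
    """Assign each probability to a bucket that is `step` percentiles wide.

    Hypothetical helper: assumes the ranks are 5-percentile-wide buckets.
    """
    proba = np.asarray(proba, dtype=float)
    # Interior cut points at the 5th, 10th, ..., 95th percentile of the scores.
    edges = np.percentile(proba, np.arange(step, 100, step))
    # np.digitize returns 0 for the lowest bucket and len(edges) for the highest.
    return np.digitize(proba, edges)


# Evenly spaced scores stand in for predicted probabilities.
scores = np.linspace(0.0, 1.0, 101)
ranks = rank_into_percentile_bins(scores)
```

With the default step, the ranks run from 0 (lowest scores) to 19 (highest scores).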

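from sklearn.pipeline import Pipeline
from xgboost import XGBClassifier

# custom_trsf1 and custom_trsf2 are the custom transformer instances described above
Pipeline([
          ('custom_trsf1', custom_trsf1),
          ('clf', XGBClassifier()),
          ('custom_trsf2', custom_trsf2)])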

The problem is that a sklearn pipeline requires every step except the last to have both a fit and a transform method. Can I solve this some other way than extending XGBClassifier and adding a transform method to it?
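
The constraint can be reproduced with a small sketch. LogisticRegression is used here only as a stand-in for XGBClassifier so that the example depends on sklearn alone, and the identity FunctionTransformer is a placeholder for the second custom transformer; both substitutions are assumptions made for illustration. Depending on the sklearn version the check runs when the pipeline is constructed or when it is fitted, so both calls sit inside the try block.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer, StandardScaler

# Toy data, only used to trigger the validation of the steps.
X, y = make_classification(n_samples=50, random_state=0)

error = None
try:
    pipe = Pipeline([
        ("scale", StandardScaler()),
        ("clf", LogisticRegression()),    # classifier in a non-final position
        ("post", FunctionTransformer()),  # identity placeholder for custom_trsf2
    ])
    pipe.fit(X, y)
except TypeError as exc:
    # The classifier has no transform method, so the pipeline rejects it.
    error = exc

print(type(error).__name__)
```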

Answer 1

Looking at the source code of the Pipeline implementation, the estimator that is fitted to the data sits in the last position of your steps: the _final_estimator property of Pipeline returns the last entry of the pipeline's steps.

@property
def _final_estimator(self):
    estimator = self.steps[-1][1]
    return 'passthrough' if estimator is None else estimator

where steps might be something like

steps = [('scaler', StandardScaler(copy=True, with_mean=True, with_std=True)),
 ('svc',
  SVC(C=1.0, break_ties=False, cache_size=200, class_weight=None, coef0=0.0,
      decision_function_shape='ovr', degree=3, gamma='scale', kernel='rbf',
      max_iter=-1, probability=False, random_state=None, shrinking=True,
      tol=0.001, verbose=False))]

After all the transformers have been fitted one after the other, the _final_estimator property is simply called to get the estimator that is then fitted to the transformed data; see line 333 for details.
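
The fitting order described here can be sketched by hand. This is a simplified stand-in, not the actual sklearn source: the helper names fit_like_pipeline and predict_like_pipeline are hypothetical, and details such as caching, 'passthrough' steps and parameter routing are left out.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def fit_like_pipeline(steps, X, y):
    """Simplified stand-in for Pipeline.fit: transform, then fit the last step."""
    Xt = X
    fitted = []
    for name, transformer in steps[:-1]:
        transformer = clone(transformer)
        # Intermediate steps need both fit and transform.
        Xt = transformer.fit_transform(Xt, y)
        fitted.append((name, transformer))
    final_name, final_estimator = steps[-1]
    # The last step only needs fit; it sees the transformed data Xt.
    fitted.append((final_name, clone(final_estimator).fit(Xt, y)))
    return fitted


def predict_like_pipeline(fitted, X):
    """Apply the fitted transformers in order, then predict with the last step."""
    Xt = X
    for _, transformer in fitted[:-1]:
        Xt = transformer.transform(Xt)
    return fitted[-1][1].predict(Xt)


X, y = make_classification(n_samples=100, random_state=0)
steps = [("scaler", StandardScaler()), ("svc", SVC())]

manual = predict_like_pipeline(fit_like_pipeline(steps, X, y), X)
reference = Pipeline(steps).fit(X, y).predict(X)
print(np.array_equal(manual, reference))
```

Comparing the two prediction arrays is a quick way to check that the hand-written loop follows the same order as Pipeline for this simple case.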

So, given steps, I can retrieve an SVC instance from its last position

final_estimator = steps[-1][1]
final_estimator
>>> SVC(C=1.0, ..., verbose=False)

and fit it to the training data

final_estimator.fit(Xt, y)

where Xt is the transformed training data (calculated before fitting the estimator) and y is the training target.
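
Since only the last step may lack a transform method, one possible way to get the behaviour asked for in the question, without subclassing XGBClassifier, is to wrap the classifier in a small transformer whose transform returns the predicted probabilities. This is a sketch of an alternative technique, not something taken from the answer above: the class names ProbaTransformer and PercentileRanker are hypothetical, LogisticRegression and StandardScaler stand in for XGBClassifier and the first custom transformer, and the 5-percentile-wide buckets are an assumption about the intended ranking.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin, clone
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


class ProbaTransformer(TransformerMixin, BaseEstimator):
    """Hypothetical wrapper: exposes a classifier's predict_proba as transform."""

    def __init__(self, estimator):
        self.estimator = estimator

    def fit(self, X, y=None):
        self.estimator_ = clone(self.estimator).fit(X, y)
        return self

    def transform(self, X):
        # Probability of the positive class (binary classification assumed).
        return self.estimator_.predict_proba(X)[:, 1]


class PercentileRanker(TransformerMixin, BaseEstimator):
    """Hypothetical ranker: buckets scores by percentiles learned during fit."""

    def __init__(self, step=5):
        self.step = step

    def fit(self, X, y=None):
        cuts = np.arange(self.step, 100, self.step)
        self.edges_ = np.percentile(np.asarray(X, dtype=float), cuts)
        return self

    def transform(self, X):
        return np.digitize(np.asarray(X, dtype=float), self.edges_)


X, y = make_classification(n_samples=200, random_state=0)

pipe = Pipeline([
    ("scale", StandardScaler()),                      # stands in for custom_trsf1
    ("clf", ProbaTransformer(LogisticRegression())),  # stands in for XGBClassifier
    ("rank", PercentileRanker(step=5)),               # plays the role of custom_trsf2
])

# The last step is a transformer, so the ranks come from transform, not predict.
ranks = pipe.fit(X, y).transform(X)
```

One design consequence worth noting: because the final step is now a transformer, the pipeline no longer offers predict or predict_proba, so the raw probabilities would have to be read from the wrapped classifier if they are still needed.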

Solution 1
0 2020-11-26 18:21:59
