
How to use cross-validation with custom estimator in sklearn?

I have written a custom estimator class with fit and transform methods. I am able to create a model, train it, and predict with it.

However, during cross-validation I run into this error: TypeError: cannot deepcopy this pattern object

This is what the custom estimator looks like:

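import scipy.sparse as sp
from sklearn.base import BaseEstimator, TransformerMixin


class DefaultEstimator(BaseEstimator, TransformerMixin):
    def __init__(self, preprocessor, pipelines):
        # Constructor arguments are stored unchanged; get_params() reads them back.
        self.preprocessor = preprocessor
        self.pipelines = pipelines

    def fit(self, X, y=None):
        # Fit every pipeline on the preprocessed input.
        for each_pipeline in self.pipelines:
            each_pipeline.fit(self.preprocessor.apply(X), y)
        return self

    def transform(self, X):
        # Transform with every pipeline and stack the results column-wise.
        transformed_data = []
        for each_pipeline in self.pipelines:
            transformed_data.append(each_pipeline.transform(self.preprocessor.apply(X)))
        return sp.hstack(transformed_data)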

Does anyone have an idea how to approach this issue?

I would suggest having the preprocessor inside the pipeline itself. cross_val_score tries to copy the parameters of the estimator, and it breaks when the estimator cannot return those parameters from get_params().

I am not sure whether your pipelines parameter is a sklearn Pipeline, because a Pipeline object is not iterable.
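To make the copying step concrete, here is a minimal sketch of what happens to the constructor arguments when an estimator is cloned, which is what cross_val_score does for every fold. The Demo class and the values passed to it are hypothetical stand-ins, not the asker's real objects.

```python
from sklearn.base import BaseEstimator, clone


class Demo(BaseEstimator):
    """Hypothetical estimator with the same constructor shape as the question's class."""

    def __init__(self, preprocessor=None, pipelines=None):
        self.preprocessor = preprocessor
        self.pipelines = pipelines


# A plain dict stands in for the real preprocessor object (assumption).
est = Demo(preprocessor={"lowercase": True}, pipelines=[])

# get_params() reads the constructor arguments back from the attributes.
params = est.get_params(deep=False)
print(sorted(params))

# clone() rebuilds the estimator from copies of those parameters;
# parameters that are not estimators themselves are deep-copied.
est_copy = clone(est)
print(est_copy.preprocessor is est.preprocessor)
```

If any parameter refuses to be deep-copied, the clone fails before a single fold is fitted, which matches the point at which the question's error appears.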

As suggested in a few comments, this error occurs because self.preprocessor can't be deep-cloned.
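The error message suggests that the preprocessor holds a compiled regular expression, which could not be deep-copied before Python 3.7. Since newer Python versions no longer reproduce that exact failure, the sketch below uses a threading.Lock as a stand-in for any attribute that cannot be deep-copied. The Preprocessor class and the can_deepcopy helper are hypothetical names used only for illustration.

```python
import copy
import threading


class Preprocessor:
    """Hypothetical preprocessor that holds an attribute which cannot be deep-copied."""

    def __init__(self):
        # Stand-in for something like a compiled regex on an old Python version.
        self._lock = threading.Lock()

    def apply(self, X):
        return X


def can_deepcopy(obj):
    """Return True if copy.deepcopy(obj) succeeds, False if it raises TypeError."""
    try:
        copy.deepcopy(obj)
        return True
    except TypeError:
        return False


print(can_deepcopy(Preprocessor()))
print(can_deepcopy({"alpha": 1.0}))
```

A quick check like this on your own preprocessor should tell you whether it is the object that blocks cloning.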

So the workaround is to remove the preprocessing step from this class and run it either as an independent preprocessing step or inside the pipeline itself.
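One way this could look, as a rough sketch rather than a drop-in replacement: the preprocessing becomes a plain function wrapped in FunctionTransformer, and FeatureUnion takes over the job of stacking the outputs of several transformers side by side. The preprocess function, the two scalers, the classifier, and the random data are all assumptions chosen only to make the example run.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import FunctionTransformer, MinMaxScaler, StandardScaler


def preprocess(X):
    # Hypothetical stand-in for preprocessor.apply(X).
    # A module-level function holds no state, so cloning it is not a problem.
    return np.asarray(X, dtype=float)


# FeatureUnion fits each transformer and concatenates their outputs column-wise,
# which replaces the manual loop plus sp.hstack in the custom class.
features = FeatureUnion([
    ("standard", StandardScaler()),
    ("minmax", MinMaxScaler()),
])

model = Pipeline([
    ("preprocess", FunctionTransformer(preprocess)),
    ("features", features),
    ("clf", LogisticRegression()),
])

# Synthetic data, only so that the example is self-contained.
rng = np.random.RandomState(0)
X = rng.rand(60, 4)
y = (X[:, 0] > 0.5).astype(int)

scores = cross_val_score(model, X, y, cv=3)
print(scores)
```

If the preprocessing really needs a compiled regex, building it inside the function (or inside fit) instead of passing it to the constructor keeps it out of the parameters that get cloned.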
