How to use cross-validation with a custom estimator in sklearn?
I have written a custom estimator class with fit and transform methods. I am able to create a model, train it, and predict with it.
However, while doing cross-validation, I run into this error:
TypeError: cannot deepcopy this pattern object
This is what the custom estimator looks like:
    class DefaultEstimator(BaseEstimator, TransformerMixin):
        def __init__(self, preprocessor, pipelines):
            self.preprocessor = preprocessor
            self.pipelines = pipelines

        def fit(self, X, y=None):
            for each_pipeline in self.pipelines:
                each_pipeline.fit(self.preprocessor.apply(X), y)
            return self

        def transform(self, X):
            transformed_data = []
            for each_pipeline in self.pipelines:
                transformed_data.append(each_pipeline.transform(self.preprocessor.apply(X)))
            return sp.hstack(transformed_data)
Does anyone have an idea how to approach this issue?
I would suggest putting the preprocessor inside the pipeline itself.
cross_val_score clones the estimator for each fold, which means copying the parameters returned by get_params(). It breaks when one of those parameters cannot be copied.
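To make the cloning step concrete, here is a minimal sketch that calls sklearn.base.clone directly, which is what cross_val_score does before fitting each fold. LockedPreprocessor and Wrapper are hypothetical names, not the classes from the question. The threading.Lock is only a stand-in for the compiled regex pattern, because recent Python versions can deep-copy compiled patterns and would not reproduce the original error.

```python
import threading

from sklearn.base import BaseEstimator, TransformerMixin, clone


class LockedPreprocessor:
    """Hypothetical preprocessor holding an object that cannot be deep-copied."""

    def __init__(self):
        # Stand-in for the compiled regex pattern in the question.
        self._lock = threading.Lock()

    def apply(self, X):
        return X


class Wrapper(BaseEstimator, TransformerMixin):
    """Hypothetical estimator that takes the preprocessor as a constructor parameter."""

    def __init__(self, preprocessor):
        self.preprocessor = preprocessor

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return self.preprocessor.apply(X)


def try_clone(estimator):
    """Return the exception raised by clone(), or None if cloning succeeds."""
    try:
        clone(estimator)
        return None
    except TypeError as exc:
        return exc


# clone() reads get_params() and deep-copies every non-estimator value,
# so the lock inside the preprocessor makes the copy fail.
error = try_clone(Wrapper(LockedPreprocessor()))
print(type(error).__name__, error)
```

The estimator itself fits and transforms fine. The failure only shows up once something tries to copy its constructor parameters, which matches the behaviour described in the question.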
I am not sure whether your pipelines parameter is a sklearn Pipeline, because a Pipeline object is not iterable.
As suggested in a few comments, this error occurs because self.preprocessor can't be deep-cloned.
So, the workaround for this error is to remove the preprocessing step from this class and either run it as an independent preprocessing step or move it inside the pipeline itself.
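One way this workaround could look, as a sketch under assumptions rather than the asker's actual setup: the preprocessing becomes a FunctionTransformer step, and a FeatureUnion takes over the loop-and-hstack logic of the custom class. The preprocess function, the PCA and SelectKBest branches, the LogisticRegression classifier and the synthetic data are all placeholders chosen for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import FunctionTransformer


def preprocess(X):
    """Hypothetical stand-in for preprocessor.apply(X)."""
    return np.clip(X, -3.0, 3.0)


# FeatureUnion fits each branch on the same input and concatenates the
# outputs column-wise, which replaces the manual loop and hstack.
features = FeatureUnion([
    ("pca", PCA(n_components=2)),
    ("kbest", SelectKBest(k=3)),
])

model = Pipeline([
    # A plain module-level function can be copied safely during cloning.
    ("preprocess", FunctionTransformer(preprocess)),
    ("features", features),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Synthetic data, used only so the sketch runs end to end.
X, y = make_classification(n_samples=120, n_features=8, random_state=0)
scores = cross_val_score(model, X, y, cv=3)
print(scores)
```

Because every step is now a regular sklearn object with copyable parameters, cross_val_score can clone the whole pipeline for each fold.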