How to use imblearn undersampler in pipeline?
I have the following pipeline structure:
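from imblearn.under_sampling import RandomUnderSampler
from imblearn.pipeline import Pipeline
from sklearn.compose import ColumnTransformer
from sklearn.feature_selection import SelectKBest, VarianceThreshold, chi2
# numeric_transformer, numeric_cols and model are defined elsewhere (not shown)
sel = SelectKBest(k='all', score_func=chi2)
under = RandomUnderSampler(sampling_strategy=0.2)
preprocessor = ColumnTransformer(transformers=[('num', numeric_transformer, numeric_cols)])
final_pipe = Pipeline(steps=[('sample', under), ('preprocessor', preprocessor), ('var', VarianceThreshold()), ('sel', sel), ('clf', model)])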
But I get the following error:
TypeError: All intermediate steps of the chain should be estimators that implement fit and transform or fit_resample (but not both) or be a string 'passthrough' '<class 'sklearn.compose._column_transformer.make_column_selector'>' (type <class 'type'>) doesn't)
I don't understand what I'm doing wrong. Can anyone help?
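One possible reading of the traceback: the step it complains about is the class `make_column_selector` itself (`type <class 'type'>`), which suggests a column selector ended up in the list of pipeline steps somewhere in the real code. `make_column_selector` has no `fit` or `transform`; it is normally passed as the column specification inside a `ColumnTransformer`. The sketch below shows that placement on a hypothetical toy DataFrame (the column names and the `StandardScaler` are assumptions, not taken from the question).

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer, make_column_selector
from sklearn.preprocessing import StandardScaler

# Hypothetical toy frame: two numeric columns and one text column.
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [10, 20, 30, 40],
    "c": ["x", "y", "x", "y"],
})

# make_column_selector is not an estimator, so it cannot be a pipeline step.
has_fit = hasattr(make_column_selector, "fit")

# It belongs in the third slot of a ColumnTransformer entry, as an instance.
preprocessor = ColumnTransformer(
    transformers=[
        ("num", StandardScaler(), make_column_selector(dtype_include=np.number)),
    ]
)

# Only the numeric columns are kept; the text column is dropped by default.
Xt = preprocessor.fit_transform(df)
print(has_fit, Xt.shape)
```

If the real pipeline has a step written as `('name', make_column_selector)`, moving the selector into the `ColumnTransformer` as above may be enough to clear this particular error.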
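@Math12, I ran into the same problem recently. I solved it by wrapping RandomUnderSampler() in a custom function and then passing that function to FunctionTransformer.
I rewrote your code as follows, and it worked for me.
Here is the code snippet: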
from imblearn.under_sampling import RandomUnderSampler
from imblearn.pipeline import Pipeline
from sklearn.compose import ColumnTransformer
from sklearn.feature_selection import SelectKBest, VarianceThreshold, chi2
from sklearn.preprocessing import FunctionTransformer

# numeric_transformer, numeric_cols and model are defined as in the question
sel = SelectKBest(k='all', score_func=chi2)
preprocessor = ColumnTransformer(transformers=[('num', numeric_transformer, numeric_cols)])

def Data_Preprocessing_3(df):
    # fit the random under-sampler on the training data
    # note: imblearn defines fit_resample(X, y), so this call also needs the target passed in
    rus = RandomUnderSampler(sampling_strategy=0.2)
    df = rus.fit_resample(df)
    return df

# outside the function above, wrap it in a FunctionTransformer
under = FunctionTransformer(Data_Preprocessing_3)

# build the pipeline as before
final_pipe = Pipeline(steps=[('sample', under), ('preprocessor', preprocessor), ('var', VarianceThreshold()), ('sel', sel), ('clf', model)])