How to add an oversampling/undersampling procedure to scikit-learn's Pipeline?

I would like to add an oversampling procedure, such as SMOTE, to scikit-learn's Pipeline. However, transformers only support the fit and transform methods, and they provide no way to increase the number of samples and targets.

One possible way to do this is to break the pipeline into two separate pipelines connected by a SMOTE sampling step.
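The two-pipeline workaround described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `random_oversample` is a hypothetical helper that duplicates minority rows and stands in for SMOTE (it is not SMOTE itself), and the toy data, the `StandardScaler` step, and the `LogisticRegression` step are placeholders that do not come from the original question.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def random_oversample(X, y, seed=0):
    """Hypothetical stand-in for SMOTE: duplicate minority rows at random
    until every class has as many samples as the largest class."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    keep = []
    for cls, count in zip(classes, counts):
        idx = np.flatnonzero(y == cls)
        # Draw the missing number of rows (zero for the largest class).
        extra = rng.choice(idx, size=target - count, replace=True)
        keep.append(np.concatenate([idx, extra]))
    keep = np.concatenate(keep)
    return X[keep], y[keep]


# Toy imbalanced data (assumed): 90 samples of class 0, 10 of class 1.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 1.0, (90, 2)), rng.normal(3.0, 1.0, (10, 2))])
y = np.array([0] * 90 + [1] * 10)

# Pipeline 1: preprocessing steps that keep the number of samples unchanged.
pre = Pipeline([("scale", StandardScaler())])
# Pipeline 2: the steps that should be fitted on the resampled data.
clf = Pipeline([("model", LogisticRegression())])

# Training: pipeline 1, then resampling, then pipeline 2.
X_pre = pre.fit_transform(X)
X_res, y_res = random_oversample(X_pre, y)
clf.fit(X_res, y_res)

# Prediction: the resampling step is skipped entirely.
pred = clf.predict(pre.transform(X))
```

Resampling is applied only while fitting. At prediction time the data goes straight from the first pipeline to the second, which is the bookkeeping the split forces you to do by hand.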

Are there any better solutions?

Our current Pipeline does not support changing the number of samples between steps, because the Transformer.transform method does not return the y argument, which would also need to be resampled. This is a known limitation of the current design. It might be fixed in a future version, but we have not started working on that yet.
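To make the limitation concrete, here is a toy sketch in plain Python (not scikit-learn code) of what a pipeline would need in order to change the number of samples: a step that returns both X and y. Every name here (`ToyPipeline`, `Scale`, `DuplicateMinority`, and the `fit_resample` method as used in this sketch) is hypothetical and chosen only for illustration.

```python
from collections import Counter


class Scale:
    """Ordinary transformer: transform returns only X, so y cannot change."""

    def __init__(self, factor):
        self.factor = factor

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return [[v * self.factor for v in row] for row in X]


class DuplicateMinority:
    """Hypothetical sampler: returns both X and y, so it may add rows."""

    def fit_resample(self, X, y):
        counts = Counter(y)
        target = max(counts.values())
        X_out, y_out = list(X), list(y)
        for label, count in counts.items():
            rows = [row for row, lab in zip(X, y) if lab == label]
            # Cycle through existing rows of this class until it is balanced.
            for i in range(target - count):
                X_out.append(rows[i % len(rows)])
                y_out.append(label)
        return X_out, y_out


class ToyPipeline:
    """Hypothetical pipeline that threads y through resampling steps."""

    def __init__(self, steps):
        self.steps = steps

    def fit_transform(self, X, y):
        for step in self.steps:
            if hasattr(step, "fit_resample"):
                X, y = step.fit_resample(X, y)  # sample count may change
            else:
                X = step.fit(X, y).transform(X)  # sample count is fixed
        return X, y


X = [[1.0], [2.0], [3.0], [4.0]]
y = [0, 0, 0, 1]
X_out, y_out = ToyPipeline([Scale(10), DuplicateMinority()]).fit_transform(X, y)
```

As far as I know, the third-party imbalanced-learn package takes a similar approach, with samplers that expose `fit_resample` and a Pipeline class of its own that applies them during fitting only. Check its documentation for the exact API.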

