
Differences between QuantileTransformer and PowerTransformer

In sklearn, the documentation of QuantileTransformer says:

This method transforms the features to follow a uniform or a normal distribution

The documentation of PowerTransformer says:

Apply a power transform featurewise to make data more Gaussian-like

It seems both of them can transform features to a Gaussian/normal distribution. What are the differences in this respect, and when should each be used?

The terminology they use is confusing, because "Gaussian" and "normal" refer to the SAME distribution.

QuantileTransformer and PowerTransformer are both non-linear.

To answer your question about what exactly the difference is, according to https://scikit-learn.org :

"QuantileTransformer provides non-linear transformations in which distances between marginal outliers and inliers are shrunk. PowerTransformer provides non-linear transformations in which data is mapped to a normal distribution to stabilize variance and minimize skewness."

Source and more info here: https://scikit-learn.org/stable/auto_examples/preprocessing/plot_all_scaling.html#:~:text=QuantileTransformer%20provides%20non%2Dlinear%20transformations,stabilize%20variance%20and%20minimize%20skewness
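To see the difference concretely, here is a minimal sketch: both transformers are fitted on the same heavily right-skewed (log-normal) feature, and the skewness of each output is compared. The data and parameter values are illustrative, not from the original post.

```python
import numpy as np
from scipy.stats import skew
from sklearn.preprocessing import PowerTransformer, QuantileTransformer

rng = np.random.RandomState(0)
# A heavily right-skewed feature, shape (n_samples, n_features)
X = rng.lognormal(mean=0.0, sigma=1.0, size=(1000, 1))

# Parametric: fits a single Yeo-Johnson lambda per feature
pt = PowerTransformer(method="yeo-johnson")
# Non-parametric: maps empirical quantiles onto a normal distribution
qt = QuantileTransformer(output_distribution="normal",
                         n_quantiles=1000, random_state=0)

X_pt = pt.fit_transform(X)
X_qt = qt.fit_transform(X)

print("raw skew:                ", skew(X.ravel()))
print("PowerTransformer skew:   ", skew(X_pt.ravel()))
print("QuantileTransformer skew:", skew(X_qt.ravel()))
```

On data like this, both outputs are far less skewed than the input, and the quantile-based output is typically the closer of the two to an exact Gaussian, since it is not constrained to a single-parameter family of transforms.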

The main difference is that PowerTransformer() is parametric while QuantileTransformer() is non-parametric. Box-Cox or Yeo-Johnson will make your data look more "normal" (i.e. less skewed and more centered), but it is often still far from a perfect Gaussian. QuantileTransformer(output_distribution='normal') results usually look much closer to Gaussian, at the cost of distorting linear relationships somewhat more. I believe there is no rule of thumb to decide which one will work better in a given case, but it is worth noting that you can select the optimal scaler in a pipeline when doing e.g. GridSearchCV().
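Selecting the scaler via GridSearchCV, as suggested above, can be sketched like this: the pipeline step named "scale" is treated as a hyperparameter and swapped between candidate transformers. The synthetic data, Ridge model, and parameter choices are illustrative assumptions, not part of the original answer.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import (PowerTransformer, QuantileTransformer,
                                   StandardScaler)

X, y = make_regression(n_samples=200, n_features=5, noise=10.0,
                       random_state=0)

pipe = Pipeline([("scale", StandardScaler()),
                 ("model", Ridge())])

# The whole "scale" step is a search dimension: each candidate
# transformer is cross-validated as part of the pipeline.
param_grid = {
    "scale": [
        StandardScaler(),
        PowerTransformer(),
        QuantileTransformer(output_distribution="normal", n_quantiles=100),
    ],
}

search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X, y)
print("best scaler:", search.best_params_["scale"])
```

Note that n_quantiles is lowered to 100 here so that it does not exceed the number of samples in each cross-validation training fold.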

