Anomaly detection python
I have a dataset with 7 parameters for each point:
I would like to find a way to get all the outliers into a Python list (not as a plt.show GUI). What algorithm should I use, and how can I view the results as a Python list? Thanks for your help :D
This page on Medium from Will Badr is a good resource: https://towardsdatascience.com/5-ways-to-detect-outliers-that-every-data-scientist-should-know-python-code-70a54335a623 . As for which outlier detection algorithm to use, the answer depends on the distribution of your data. I have had success using standard deviations and distance from the interquartile range to identify outliers. However, these approaches work better on normally distributed data, and in my scenario I found methods to transform my data into a normal distribution without impacting the outcome.
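To get the outliers as a plain Python list rather than a plot, the standard-deviation and interquartile-range ideas can be sketched roughly as follows. This is only a minimal illustration, assuming the data fits in a NumPy array with one row per point and 7 columns. The function names `zscore_outliers` and `iqr_outliers`, the thresholds (3 standard deviations, 1.5 times the IQR), and the synthetic data are my own assumptions, not something taken from the linked article.

```python
import numpy as np

def zscore_outliers(data, threshold=3.0):
    """Indices of rows where any column lies more than `threshold`
    standard deviations from that column's mean (hypothetical helper)."""
    data = np.asarray(data, dtype=float)
    mean = data.mean(axis=0)
    std = data.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant columns
    z = np.abs((data - mean) / std)
    return np.where((z > threshold).any(axis=1))[0].tolist()

def iqr_outliers(data, k=1.5):
    """Indices of rows where any column falls outside
    [Q1 - k*IQR, Q3 + k*IQR] for that column (hypothetical helper)."""
    data = np.asarray(data, dtype=float)
    q1, q3 = np.percentile(data, [25, 75], axis=0)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    mask = ((data < lower) | (data > upper)).any(axis=1)
    return np.where(mask)[0].tolist()

# Synthetic stand-in for the real dataset: 200 points, 7 parameters each.
rng = np.random.default_rng(0)
points = rng.normal(size=(200, 7))
points[5] = 25.0  # inject one obvious outlier for demonstration

z_idx = zscore_outliers(points)      # list of row indices
iqr_idx = iqr_outliers(points)       # list of row indices
outlier_rows = points[z_idx].tolist()  # the outlying points as nested lists

print(z_idx)
print(iqr_idx)
```

Both functions return ordinary Python lists of row indices, so you can index back into your data or print them directly. The IQR rule will usually flag more points than the 3-sigma rule, so the thresholds probably need tuning for your data.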