
MATLAB ksdensity equivalent in Python

I've looked online and have yet to find an answer or a way to work out the following.

I'm translating some MATLAB code to Python. In MATLAB, I compute a kernel density estimate with the function:

[p,x] = ksdensity(data)

where p is the estimated probability density at each point x of the distribution.

SciPy has a function, but it only returns p.

Is there a way to get the density at given values of x?

Thanks!

That form of the ksdensity call automatically generates an arbitrary x. scipy.stats.gaussian_kde() returns a callable that can be evaluated at any x of your choosing. A close equivalent of MATLAB's x would be np.linspace(data.min(), data.max(), 100).

import numpy as np
from scipy import stats

data = ...
kde = stats.gaussian_kde(data)
x = np.linspace(data.min(), data.max(), 100)
p = kde(x)
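To get both outputs from one call, as in the MATLAB form, the snippet above can be wrapped in a small helper. This is only a sketch: the name ksdensity_like and the pad parameter are made up for illustration, and the numbers need not match MATLAB's exactly, since the default bandwidth rule and the evaluation grid may differ.

```python
import numpy as np
from scipy import stats


def ksdensity_like(data, n_points=100, pad=0.0):
    """Return (p, x) in the spirit of MATLAB's [p, x] = ksdensity(data).

    `ksdensity_like` and `pad` are hypothetical names for this sketch.
    `pad` widens the grid beyond the data range by that many sample
    standard deviations (0.0 reproduces the linspace used above).
    """
    data = np.asarray(data, dtype=float).ravel()
    kde = stats.gaussian_kde(data)  # SciPy picks the bandwidth (Scott's rule by default)
    margin = pad * data.std(ddof=1)
    x = np.linspace(data.min() - margin, data.max() + margin, n_points)
    p = kde(x)  # density estimate at each grid point
    return p, x


# Example data: two Gaussian clusters, as in the MATLAB documentation example.
rng = np.random.default_rng(0)
sample = np.concatenate([rng.normal(0, 1, 30), rng.normal(5, 1, 30)])
p, x = ksdensity_like(sample, pad=1.0)

# Trapezoidal area under the curve; should be close to 1 when the grid
# covers most of the estimated density.
area = np.sum((p[1:] + p[:-1]) / 2 * np.diff(x))
```

Because the kde object is a callable, the same helper could just as well accept a user-supplied x instead of building a grid.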

Another option is the kernel density estimator in the Scikit-Learn Python package, sklearn.neighbors.KernelDensity.

Here is a small example, similar to the one in the MATLAB documentation for ksdensity, using data drawn from two Gaussian distributions:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.neighbors import KernelDensity

np.random.seed(12345)
# similar to MATLAB ksdensity example x = [randn(30,1); 5+randn(30,1)];
Vecvalues=np.concatenate((np.random.normal(0,1,30), np.random.normal(5,1,30)))[:,None]
Vecpoints=np.linspace(-8,12,100)[:,None]
kde = KernelDensity(kernel='gaussian', bandwidth=0.5).fit(Vecvalues)
logkde = kde.score_samples(Vecpoints)
plt.plot(Vecpoints,np.exp(logkde))
plt.show()

The plot this produces looks like:

[image: plot of the estimated density]

MATLAB is orders of magnitude faster than KernelDensity when it comes to finding the optimal bandwidth. Any idea how to make KernelDensity faster? – Yuca Jul 16 '18 at 20:58

Hi, Yuca. MATLAB uses Scott's rule to estimate the bandwidth, which is fast but assumes the data come from a normal distribution. For more information, please see this post.
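As a rough illustration of the point about rule-of-thumb bandwidths: a closed-form rule can be computed directly and passed to KernelDensity, which avoids any search over candidate bandwidths. The sketch below uses Scott's rule as SciPy defines it for 1-D data (n**(-1/5) times the sample standard deviation); that choice is an assumption for illustration and is not claimed to reproduce MATLAB's default exactly. The variable names are also made up.

```python
import numpy as np
from scipy import stats
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(12345)
values = np.concatenate([rng.normal(0, 1, 30), rng.normal(5, 1, 30)])
points = np.linspace(-8, 12, 100)

# Scott's rule as SciPy defines it for 1-D data (an assumption of this sketch):
# bandwidth = n**(-1/5) * sample standard deviation.
n = values.size
bandwidth = n ** (-1 / 5) * values.std(ddof=1)

# KernelDensity expects 2-D arrays of shape (n_samples, n_features).
kde = KernelDensity(kernel="gaussian", bandwidth=bandwidth).fit(values[:, None])
density = np.exp(kde.score_samples(points[:, None]))  # score_samples returns log-density

# Cross-check against scipy.stats.gaussian_kde, which applies the same rule by default.
reference = stats.gaussian_kde(values)(points)
max_abs_diff = np.max(np.abs(density - reference))
```

The trade-off is the one mentioned in the comment: the rule is cheap, but it is tuned for roughly normal data and can oversmooth multimodal samples such as this one, where a cross-validated bandwidth may fit better at a higher cost.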
