
Gaussian kernel density estimation with fixed covariance (in Python)

I can perform a Gaussian kernel density estimation using the SciPy library by simply running

from scipy import stats
kernel = stats.gaussian_kde(data)

but I would like to fix the covariance to some predefined value and perform the KDE with it. Is there a simple way to achieve this in Python without explicitly writing the estimation procedure myself? (I will write it if no existing library offers such functionality, but I would prefer to avoid that.)
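If no library option turns out to fit, writing the estimator directly is short: a KDE is just the average of kernel responses centered at each sample, so a fixed kernel covariance can be plugged in via `scipy.stats.multivariate_normal`. The sketch below (the function name `fixed_cov_kde` is illustrative, not a library API) assumes `data` has shape `(n_samples, n_dims)`:

```python
import numpy as np
from scipy.stats import multivariate_normal


def fixed_cov_kde(data, cov):
    """KDE whose Gaussian kernel uses a user-fixed covariance matrix.

    data: array of shape (n_samples, n_dims)
    cov:  kernel covariance, shape (n_dims, n_dims)
    Returns a function mapping query points (m, n_dims) to densities (m,).
    """
    data = np.asarray(data, dtype=float)
    n_dims = data.shape[1]
    kernel = multivariate_normal(mean=np.zeros(n_dims), cov=cov)

    def pdf(points):
        points = np.atleast_2d(points)
        # pairwise differences between queries and samples: (m, n, d)
        diffs = points[:, None, :] - data[None, :, :]
        # evaluate the kernel at every difference, then average per query
        vals = kernel.pdf(diffs.reshape(-1, n_dims)).reshape(len(points), -1)
        return vals.mean(axis=1)

    return pdf
```

Note this evaluates all sample/query pairs at once, so memory grows as `m * n`; for large data you would loop over query chunks instead.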

From my comments:

Generally, for density estimation, the Gaussian involved serves as a "window" function, and the "covariance" of that window (effectively the bandwidth parameter in the 1-D case) is just meant to control how the window's response falls off as a function of distance from the point under test. I am not familiar with any KDE procedure that uses a specific multivariate covariance structure for this window fall-off effect.
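The bandwidth-as-window idea is easy to see with SciPy's own estimator: `gaussian_kde` accepts a scalar `bw_method` that scales the window, and a narrow window hugs the samples while a wide one oversmooths. A minimal 1-D sketch:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
data = rng.normal(size=200)  # samples from a standard normal

# bw_method scales the kernel width: small = fast fall-off, large = slow
narrow = stats.gaussian_kde(data, bw_method=0.1)
wide = stats.gaussian_kde(data, bw_method=1.0)

x = np.linspace(-4.0, 4.0, 9)
narrow_vals = narrow(x)  # tracks local structure of the samples
wide_vals = wide(x)      # heavily smoothed, flatter estimate
```

Since the wide kernel smears mass outward, its estimate at the mode is lower than the narrow kernel's.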

I would also guess that the most complicated such "covariance" that would be advisable in practice would be a diagonal matrix, where you just use a different bandwidth parameter for each dimension of the data. Maybe (and it's a very tenuous maybe) you could do some kind of PCA breakdown into the principal directions of your data and put the different bandwidths there, but I think it's highly unlikely this will pay off unless the data directions have wildly different scales, in which case you'd be better off just standardizing your inputs before doing the KDE in the first place, and using one bandwidth.
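The PCA idea above amounts to whitening: rotate the data into its principal axes and rescale each axis to unit variance, after which a single isotropic bandwidth behaves like per-direction bandwidths in the original space. A sketch, assuming 2-D data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# correlated data with different scales along its principal directions
raw = rng.multivariate_normal([0.0, 0.0], [[4.0, 1.5], [1.5, 1.0]], size=500)

mean = raw.mean(axis=0)
cov = np.cov(raw, rowvar=False)
evals, evecs = np.linalg.eigh(cov)       # principal directions and variances
whiten = evecs / np.sqrt(evals)          # scale each principal axis to unit variance
transformed = (raw - mean) @ whiten      # whitened data: covariance ~ identity

kernel = stats.gaussian_kde(transformed.T)  # one scalar bandwidth now suffices
```

Evaluating at new points requires pushing them through the same `(x - mean) @ whiten` transform first.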

If you read the KDE examples from scikit-learn, and the documentation for their KernelDensity class, it also seems that (like SciPy) they just offer you a bandwidth parameter (a single floating-point number) to summarize the way the kernel's response should fall off.
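For reference, scikit-learn's interface looks like this: `bandwidth` is one scalar applied isotropically, and `score_samples` returns log-densities. A minimal sketch:

```python
import numpy as np
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(2)
data = rng.normal(size=(300, 2))

# a single scalar bandwidth controls the fall-off in every direction
kde = KernelDensity(kernel='gaussian', bandwidth=0.4).fit(data)

# score_samples returns the log of the estimated density
log_density = kde.score_samples(np.array([[0.0, 0.0], [4.0, 4.0]]))
```

There is no parameter for a full kernel covariance matrix; any anisotropy has to come from transforming the data itself.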

To me this suggests it's not of much practical interest to have fine-grained control over multivariate bandwidth settings. Your best bet is to perform some standardization that transforms your input variables so that they are all on the same scale (so that smoothing in every direction with the same bandwidth is appropriate), then use the KDE to predict or classify values in that transformed space, and apply the inverse transformation to each coordinate if you want to go back to the original scale.
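The standardize-then-KDE workflow can be sketched as follows: z-score each dimension, fit one KDE, and evaluate original-space queries through the same transform (dividing by the product of the scales is the change-of-variables factor that converts back to a density in the original units):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
# two dimensions on wildly different scales
raw = np.column_stack([rng.normal(0.0, 1.0, 500),
                       rng.normal(0.0, 100.0, 500)])

mu, sigma = raw.mean(axis=0), raw.std(axis=0)
scaled = (raw - mu) / sigma            # z-score: unit scale in every direction

kernel = stats.gaussian_kde(scaled.T)  # a single bandwidth now suits both axes

# evaluate at a point given in the original space: apply the same transform,
# then divide by the Jacobian sigma.prod() to get an original-space density
query = np.array([[0.5, 50.0]])
density = kernel(((query - mu) / sigma).T) / sigma.prod()
```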
