简体   繁体   中英

How to create a kernel density estimation with R?

I would like to program a kernel estimate (with Epanechnikov kernel^1 for example). I tried the following code^2 by putting the manual code (blue) and the default code (red) on the same figure (see attached) but it always gives a difference between the two density curves!


1: The analytic form of the Epanechnikov kernel is: kappa(u) = (1-u^2), support |u| <=1, with u = (x-x_{i})/h.


2: My trial code:

x= faithful$eruptions

fit2 <- density(x, bw = 0.6, kernel = "epanechnikov")

xgrid = seq(-1, 8, 0.1)

kernelEpan <- function(x, obs, h) sum((1-((x-obs)/h)^2)*(abs(x-obs)<=h))/h

plot(xgrid, sapply(xgrid, FUN = kernelEpan, obs = faithful$eruptions, h = 0.6)/length(faithful$eruptions), type = "l", col = "blue")

lines(fit2, col = "red")

在此处输入图像描述

If you read the docs for bw in the density function, you will see:

bw : the smoothing bandwidth to be used. The kernels are scaled such that this is the standard deviation of the smoothing kernel.

Which means that in order for your function's h parameter to match the behaviour of the bw parameter, you will need to rescale the h parameter by multiplying it by sqrt(5) .

I would be tempted to vectorize your function, which allows you to normalize it accurately too:

kernelEpan <- function(xvals, obs, h) {
  
  h <- h * sqrt(5)
  
  dens <- sapply(xvals, function(x) {
    u <- abs(x - obs) / h
    u <- ifelse(u > 1, 1, u)
    sum(1 - u^2)
  }) 
  
  dens / sum(dens * mean(diff(xvals)))
}

This allows:

fit1 <- kernelEpan(xgrid, obs = faithful$eruptions, h = 0.6)

fit2 <- density(x, bw = 0.6, kernel = "epanechnikov")

plot(xgrid, fit1, type = "l", col = "blue")

lines(fit2, col = "red")

在此处输入图像描述

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM