
Kernel density estimator in R

I'm using the last column of the galaxy data set that is loaded in the code below.

I'm trying to apply a kernel density estimator to this data set. The estimator is defined by

$$\hat{f}(x) = \frac{1}{nh} \sum_{i=1}^{n} K\!\left(\frac{x - X_i}{h}\right)$$

where K is a kernel (usually, though not necessarily, the normal density), h is the bandwidth, n is the number of data points, X_i is the i-th data point, and x is the point at which the density is evaluated. Using this equation, I have the following code:

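# Load the galaxy data; the last column (V3) is the variable of interest
AstroData = read.table(paste0("http://www.stat.cmu.edu/%7Elarry",
                              "/all-of-nonpar/=data/galaxy.dat"),
                       header = FALSE)
x = AstroData$V3
xsorted = sort(x)
x_i = xsorted[1:1266]
hist(x_i, nclass = 308)
n = length(x_i)
h1 = 0.002                           # bandwidth
t = seq(min(x_i), max(x_i), 0.01)    # evaluation grid with step 0.01
M = length(t)
fhat1 = rep(0, M)
# Evaluate the kernel density estimate at each grid point
for (i in 1:M) {
  fhat1[i] = sum(dnorm((t[i] - x_i) / h1)) / (n * h1)
}
lines(t, fhat1, lwd = 2, col = "red")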

This produces the following plot:

[image: resulting plot]

This is actually close to what I want. Once the histogram is removed, the final result should look like this:

[image: final plot]

As you can see, the target plot is more finely resolved, while my red line, which should represent the density, is rather rough and does not rise as high. The target plot was produced with the density function in R:

plot(density(x = x_i, bw = 0.002))

That is the result I want to reproduce without using any additional packages.

Thank you.
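For reference, the estimator described above can also be written without an explicit loop. The following is only a minimal sketch, assuming a Gaussian kernel; the helper name kde_gauss and the simulated data are illustrative assumptions and are not part of the original code or the galaxy file.

```r
# Hypothetical helper: Gaussian kernel density estimate evaluated on a grid.
# x_obs: observed data, grid: evaluation points, h: bandwidth.
kde_gauss <- function(x_obs, grid, h) {
  n <- length(x_obs)
  # outer() builds the matrix of scaled differences (grid[j] - x_obs[i]) / h,
  # with one row per grid point and one column per observation
  u <- outer(grid, x_obs, "-") / h
  rowSums(dnorm(u)) / (n * h)
}

set.seed(1)                         # simulated stand-in data (assumption)
x_obs <- rnorm(200, mean = 0, sd = 1)
h     <- 0.3
grid  <- seq(min(x_obs) - 4 * h, max(x_obs) + 4 * h, length.out = 512)
fhat  <- kde_gauss(x_obs, grid, h)

# Riemann-sum check: a density estimate should integrate to roughly 1
area <- sum(fhat) * (grid[2] - grid[1])
```

The density function lives in the stats package that ships with base R, and with its default Gaussian kernel the bw argument is the standard deviation of the kernel, so plot(density(x_obs, bw = h)) should give a curve close to lines(grid, fhat).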

After I talked this over with my roommate, he suggested decreasing the spacing between the t values (the x grid). I changed the step from 0.01 to 0.001, so the final code for this plot is:

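# Load the galaxy data; the last column (V3) is the variable of interest
AstroData = read.table(paste0("http://www.stat.cmu.edu/%7Elarry",
                              "/all-of-nonpar/=data/galaxy.dat"),
                       header = FALSE)
x = AstroData$V3
xsorted = sort(x)
x_i = xsorted[1:1266]
hist(x_i, nclass = 308)
n = length(x_i)
h1 = 0.002                           # bandwidth
t = seq(min(x_i), max(x_i), 0.001)   # finer evaluation grid: step 0.001 instead of 0.01
M = length(t)
fhat1 = rep(0, M)
# Evaluate the kernel density estimate at each grid point
for (i in 1:M) {
  fhat1[i] = sum(dnorm((t[i] - x_i) / h1)) / (n * h1)
}
lines(t, fhat1, lwd = 2, col = "blue")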

This in turn gives the following plot, which is the one I wanted:

[image: correct plot]
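The effect of the step size can be checked on simulated data. The sketch below is an illustration under stated assumptions (uniform simulated values rather than the galaxy file, and a hypothetical helper kde_at). With a bandwidth of 0.002, a grid step of 0.01 is five bandwidths wide, so the coarse grid can step right over narrow peaks, while a step of 0.001 samples each kernel bump several times.

```r
# Hypothetical helper: Gaussian KDE evaluated at each point of `grid`.
kde_at <- function(grid, x_obs, h) {
  sapply(grid, function(g) sum(dnorm((g - x_obs) / h)) / (length(x_obs) * h))
}

set.seed(42)
x_obs <- runif(300, min = 0, max = 0.2)   # simulated stand-in data (assumption)
h <- 0.002                                # same bandwidth as in the post

coarse <- seq(min(x_obs), max(x_obs), by = 0.01)    # original step
fine   <- seq(min(x_obs), max(x_obs), by = 0.001)   # reduced step

f_coarse <- kde_at(coarse, x_obs, h)
f_fine   <- kde_at(fine, x_obs, h)

# The fine grid includes the coarse grid points, so its highest value
# can only match or exceed the highest value seen on the coarse grid.
peak_coarse <- max(f_coarse)
peak_fine   <- max(f_fine)
```

Plotting both with plot(fine, f_fine, type = "l") followed by lines(coarse, f_coarse, col = "red") shows the coarse curve as a rough polyline through a subset of the fine curve's points. Separately, when a density curve is overlaid on a histogram, calling hist with freq = FALSE draws the bars on the density scale rather than as counts.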
