
Superimposing Kernel Distributions in R

I am trying to place three density functions in one plot using:

plot(density(all_noise),xlim=c(-1,1),ylim=c(0,10))
lines(density(max_nearby),col="blue")
lines(density(max_repeats),col="red")

and I got this plot: [image]

Shouldn't the density value on the y-axis be < 1? Are there better methods for superimposing kernel distributions?

str(density(all_noise))
List of 7
$ x        : num [1:512] -0.629 -0.626 -0.624 -0.622 -0.62 ...
$ y        : num [1:512] 1.41e-06 8.22e-06 3.16e-05 7.85e-05 1.24e-04 ...
$ bw       : num 0.003
$ n        : int 1924150
$ call     : language density.default(x = all_noise)
$ data.name: chr "all_noise"
$ has.na   : logi FALSE
- attr(*, "class")= chr "density"

str(density(max_nearby))
List of 7
$ x        : num [1:512] 0.154 0.156 0.158 0.16 0.162 ...
$ y        : num [1:512] 0.00111 0.00125 0.0014 0.00157 0.00175 ...
$ bw       : num 0.0543
$ n        : int 250
$ call     : language density.default(x = max_nearby)
$ data.name: chr "max_nearby"
$ has.na   : logi FALSE
- attr(*, "class")= chr "density"

str(density(max_repeats))
List of 7
$ x        : num [1:512] 0.272 0.274 0.275 0.277 0.279 ...
$ y        : num [1:512] 0.00507 0.00607 0.00722 0.00854 0.01011 ...
$ bw       : num 0.0261
$ n        : int 34
$ call     : language density.default(x = max_repeats)
$ data.name: chr "max_repeats"
$ has.na   : logi FALSE
- attr(*, "class")= chr "density"

The area under each density curve is 1, but the height of the curve can exceed 1. For instance, a uniform distribution on [0, 0.1] has density 10 everywhere on that interval. I see nothing wrong with how you're doing this. For my own purposes, about the only change I'd make would be to initialize the plot window with limits chosen so that all of the densities fall within its bounds.
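As a rough check of the point about area versus height, here is a minimal sketch on simulated data (not the asker's vectors; the seed, sample size, and standard deviation are arbitrary assumptions). It estimates a density for tightly clustered values, then approximates the area under the curve with the trapezoid rule.

```r
# Simulated, tightly clustered data (an assumption for illustration only)
set.seed(1)
x <- rnorm(1000, mean = 0, sd = 0.05)

d <- density(x)

# Height of the estimated density at its peak: well above 1 for narrow data
peak <- max(d$y)

# Trapezoid-rule approximation of the area under the estimated curve
n <- length(d$x)
area <- sum(diff(d$x) * (d$y[-1] + d$y[-n]) / 2)

peak
area
```

The peak should come out far above 1 while the area stays close to 1, which is the behaviour seen in the question's plot.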

Also, regarding the previous answer (I can't comment yet), notice that ylim is an argument to plot(), not to density(); it's not telling density() to do anything.
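Because xlim and ylim only control the plot window, one option is to compute them from the density objects themselves before plotting. The sketch below reuses the asker's variable names but fills them with simulated values, so the numbers are assumptions and not the original data.

```r
# Simulated stand-ins for the asker's vectors (values are assumptions)
set.seed(2)
all_noise   <- rnorm(5000, mean = 0,   sd = 0.05)
max_nearby  <- rnorm(250,  mean = 0.5, sd = 0.15)
max_repeats <- rnorm(34,   mean = 0.6, sd = 0.08)

# Estimate each density once and keep the results
dens <- list(density(all_noise), density(max_nearby), density(max_repeats))

# Limits wide enough to contain every curve
xlim <- range(sapply(dens, function(d) range(d$x)))
ylim <- c(0, max(sapply(dens, function(d) max(d$y))))

# The limits go to plot(); the density objects are unchanged by them
plot(dens[[1]], xlim = xlim, ylim = ylim,
     main = "Superimposed kernel densities")
lines(dens[[2]], col = "blue")
lines(dens[[3]], col = "red")
legend("topright",
       legend = c("all_noise", "max_nearby", "max_repeats"),
       col = c("black", "blue", "red"), lty = 1)
```

Computing the limits this way avoids hard-coding values such as c(-1, 1) or c(0, 10), which may clip a curve if the data change.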

A kernel density plot is not a histogram. Here is an example: compare the min and max of the x values returned by density() with the actual min and max of the data.

x <- rnorm(100)
min(x)
[1] -2.748188
max(x)
[1] 3.689254
density(x)
Call:
density.default(x = x)
Data: x (100 obs.); Bandwidth 'bw' = 0.4114

       x                 y            
 Min.   :-3.9823   Min.   :0.0001091  
 1st Qu.:-1.7559   1st Qu.:0.0079287  
 Median : 0.4705   Median :0.0612352  
 Mean   : 0.4705   Mean   :0.1121754  
 3rd Qu.: 2.6969   3rd Qu.:0.2267729  
 Max.   : 4.9234   Max.   :0.3439259 

plot(density(x))
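The wider x range in the output above comes from the cut argument of density(): by default the curve is evaluated on a grid that extends cut = 3 bandwidths beyond the extremes of the data. A small sketch on freshly simulated data (the seed is an arbitrary assumption, so the numbers will differ from those shown above):

```r
set.seed(3)
x <- rnorm(100)

d <- density(x)               # default cut = 3

range(x)                      # extremes of the data
range(d$x)                    # extremes of the evaluation grid
c(min(x) - 3 * d$bw,          # the same grid limits, computed by hand
  max(x) + 3 * d$bw)

# With cut = 0 the grid stops at the data extremes
d0 <- density(x, cut = 0)
range(d0$x)
```

Setting cut = 0 truncates the drawn curve at the data range, so the area under the plotted portion will usually be somewhat less than 1.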
