Calculating and plotting differences in kernel density distributions in R

I'm using R and want to calculate the difference between two kernel density estimates at each point on the x-axis and plot that difference, but I'm having some trouble. Is there a function or approach for this? For context, I'm using blood pressure data and want to calculate the difference between the densities for men and women at each blood pressure value.

My code for the distributions (not the differences) looks something like this (SBP = systolic blood pressure):

km <- density(data$SBP[data$GENDER==0], bw="nrd0", adjust=1, kernel="gaussian", n=512, cut=3, give.Rkern=FALSE, na.rm=FALSE)
kf <- density(data$SBP[data$GENDER==1], bw="nrd0", adjust=1, kernel="gaussian", n=512, cut=3, give.Rkern=FALSE, na.rm=FALSE)

plot(km, xlab="SBP", main="SBP Distribution of Men & Women", col="blue")
lines(kf, col="green")

I am completely new to all this! I'm pretty sure my exact question has not been asked here, but please point me to any other resources that may help. Thanks.

The density objects have elements x and y that store the x-axis values and the estimated density function values respectively. If you use the same from and to arguments (and the same n, which defaults to 512) for both density() calls, then the x values calculated will be the same.
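As a quick check of that claim, here is a minimal sketch using the question's column names. The data frame below is simulated (the means, standard deviations and group sizes are made up), and the coding of GENDER as 0 for men and 1 for women is assumed from the km/kf naming in the question.

```r
# Simulated stand-in for the asker's data frame; all values are made up
set.seed(1)
data <- data.frame(
  SBP    = c(rnorm(500, mean = 125, sd = 15), rnorm(500, mean = 120, sd = 15)),
  GENDER = rep(c(0, 1), each = 500)
)

# One common range for both groups, so both densities use the same grid
rng <- range(data$SBP, na.rm = TRUE)
km <- density(data$SBP[data$GENDER == 0], from = rng[1], to = rng[2],
              n = 512, na.rm = TRUE)
kf <- density(data$SBP[data$GENDER == 1], from = rng[1], to = rng[2],
              n = 512, na.rm = TRUE)

# The x grids should match exactly, so the y values line up point by point
same_grid <- identical(km$x, kf$x)
same_grid
head(data.frame(x = km$x, men = km$y, women = kf$y))
```

Taking from and to from the pooled data is only one reasonable choice; any fixed range that covers both groups would work the same way.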

Store the x and y values for both densities in data frames, then join them on x. You can then calculate the difference and plot it:

x <- rnorm(1000,0,1)
y <- rnorm(1000,1,1)
fx <- density(x,from = -5,to=5)
fy <- density(y,from = -5,to=5)
plot(fx,col='blue',main="SBP Distribution of Men & Women")
lines(fy, col="green")

dfx <- data.frame(x=fx$x,
                  fx=fx$y)

dfy <- data.frame(x=fy$x,
                  fy=fy$y)

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(ggplot2)

# Join on the shared grid; inner_join() takes `by`, not `on`
inner_join(dfx, dfy, by = "x") %>% 
  mutate(diff = fx - fy) %>% 
  ggplot() +
  geom_line(aes(x = x, y = diff))
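Since both densities are evaluated on the same grid, the join is not strictly needed. The following base-R sketch repeats the simulated example without dplyr or ggplot2 and subtracts the y values directly; the seed and the variable name diff_xy are illustrative choices, not part of the original answer.

```r
# Same simulated setup as above, with a seed for reproducibility
set.seed(42)
x <- rnorm(1000, 0, 1)
y <- rnorm(1000, 1, 1)
fx <- density(x, from = -5, to = 5)
fy <- density(y, from = -5, to = 5)

# Guard: direct subtraction is only valid if the grids match
stopifnot(isTRUE(all.equal(fx$x, fy$x)))

# Pointwise difference of the two density estimates
diff_xy <- fx$y - fy$y

plot(fx$x, diff_xy, type = "l",
     xlab = "x", ylab = "fx - fy",
     main = "Difference between the two density estimates")
abline(h = 0, lty = 2)  # reference line where the densities are equal
```

Positive values mark regions where the first group's estimated density is higher, negative values where the second group's is higher.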

Created on 2020-03-10 by the reprex package (v0.3.0)
