
ggplot density plot for multiple groups

I am using R's ggplot to draw density plots for multiple groups.

Using the following data.frame:

set.seed(1234)
df <- data.frame(
  sex=factor(rep(c("F", "M"), each=5)),
  weight=round(c(rnorm(5, mean=0, sd=0),
                 rnorm(5, mean=2, sd=5)))
)

Let's first plot only the female group:

library(ggplot2)
library(dplyr)
ggplot(df %>% filter(sex=="F"), aes(x=weight, color=sex)) + geom_density()

[Image: density plot for women only]

But if we try to plot both men and women:

ggplot(df, aes(x=weight, color=sex)) + geom_density()

[Image: density plot for both women and men]

We get a completely different density plot for the women.

I assumed that the density is calculated per group, so adding a different group (men in this case) shouldn't change the density for the women.

All the women have a weight of 0, so the from and to in density() are both 0, which is why you get a vertical line. When the men are added, you get a different from and to (-10 and 7, which is now the range of weight), and density() then estimates the density over that range with a bandwidth determined by the nrd0 algorithm. (See ?bw.nrd0; in this case it's about 4 for men and 0.65 for women.) The smoothing (gaussian by default) creates the peaked shape.
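The bandwidths mentioned above can be checked directly in base R. The following is a minimal sketch that rebuilds the data.frame from the question and applies bw.nrd0 to each group; the variable name bw_by_sex is only illustrative.

```r
# Rebuild the data from the question
set.seed(1234)
df <- data.frame(
  sex = factor(rep(c("F", "M"), each = 5)),
  weight = round(c(rnorm(5, mean = 0, sd = 0),
                   rnorm(5, mean = 2, sd = 5)))
)

# Default bandwidth that density() would choose for each group
bw_by_sex <- tapply(df$weight, df$sex, bw.nrd0)
print(bw_by_sex)

# Range of weight for the women alone, and for the whole data set
print(range(df$weight[df$sex == "F"]))
print(range(df$weight))
```

The women's bandwidth is non-zero even though all their weights are identical, because bw.nrd0 falls back to a default spread when both the standard deviation and the interquartile range are 0.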

To get a better idea of what's going on, try some other values for the arguments of density(), e.g.

ggplot(df, aes(x=weight, color=sex)) + geom_density(kernel = 'triangular', bw = 0.5)

[Image: triangular kernel with a longer bandwidth]
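The same experiment can be run without ggplot by calling density() on the women's weights alone. This is a rough sketch: the from and to values are assumptions taken from the range quoted in the answer (-10 and 7), and the names women and d_tri are illustrative.

```r
# The five female weights from the question: all exactly 0
women <- rep(0, 5)

# Evaluate the density over the range of the full data set,
# using the kernel and bandwidth from the geom_density() call
d_tri <- density(women, kernel = "triangular", bw = 0.5,
                 from = -10, to = 7)

print(d_tri$bw)                     # bandwidth actually used
print(range(d_tri$x))               # grid on which the density is evaluated
print(d_tri$x[which.max(d_tri$y)])  # location of the peak

# Base-graphics view of the curve ggplot would draw for this group
plot(d_tri, main = "Women only, triangular kernel, bw = 0.5")
```

Changing bw or kernel here and re-plotting shows how much of the curve's shape comes from the smoothing rather than from the data.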
