
ggplot2 density plot for data sets of different sizes in R

I have two data sets, of sizes 500 and 1000. I want to plot the densities of both data sets in one plot.
I searched Google, but the data sets in the threads I found all have the same size, for example:

df <- data.frame(x = rnorm(1000, 0, 1), y = rnorm(1000, 0, 2), z = rnorm(1000, 2, 1.5))

But if the data sets have different sizes, I think I should normalize the data first in order to compare the densities between them.

Is it possible to make a density plot of data sets with different sizes in ggplot2?

By default, all densities are scaled to unit area. If you have two datasets with different amounts of data, you can plot them together like so:

library(ggplot2)

# Two samples of different sizes
df1 <- data.frame(x = rnorm(1000, 0, 2))
df2 <- data.frame(y = rnorm(500, 1, 1))

# Each layer uses its own data set; each density is scaled to unit area
ggplot() +
  geom_density(data = df1, aes(x = x),
               fill = "#E69F00", color = "black", alpha = 0.7) +
  geom_density(data = df2, aes(x = y),
               fill = "#56B4E9", color = "black", alpha = 0.7)
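
The unit-area claim can be checked numerically. The following is a minimal sketch using base R's density(), which to my knowledge is the estimator geom_density relies on by default; the seed, the helper name area_under, and the trapezoidal integration are assumptions made for illustration and are not part of the original answer.

```r
set.seed(42)  # arbitrary seed, only so the sketch is reproducible

x1 <- rnorm(1000, 0, 2)  # larger sample
x2 <- rnorm(500, 1, 1)   # smaller sample

# Hypothetical helper: trapezoidal integration of a kernel density estimate
area_under <- function(d) {
  sum(diff(d$x) * (head(d$y, -1) + tail(d$y, -1)) / 2)
}

area1 <- area_under(density(x1))
area2 <- area_under(density(x2))

# Both areas should be close to 1, regardless of sample size
print(c(area1, area2))
```

Because both curves integrate to roughly 1, their shapes can be compared directly even though the samples differ in size.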


However, from your latest comment, I take it that's not what you want. Instead, you want the areas under the density curves to be scaled relative to the amount of data in each group. You can do that with the ..count.. computed variable, which is the density multiplied by the number of observations in the group:

df1 <- data.frame(x = rnorm(1000, 0, 2), label = rep("df1", 1000))
df2 <- data.frame(x = rnorm(500, 1, 1), label = rep("df2", 500))
df <- rbind(df1, df2)

ggplot(df, aes(x, y = ..count.., fill = label)) +
  geom_density(color = "black", alpha = 0.7) +
  scale_fill_manual(values = c("#E69F00", "#56B4E9"))
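
In newer versions of ggplot2 the same plot can be written with after_stat(count) instead of ..count... The sketch below assumes ggplot2 3.3.0 or later (where after_stat() is available; as far as I know the ..count.. notation has been deprecated since 3.4.0). It also inspects the computed values to confirm that count equals density times the group size. The seed and the variable names built and n_by_group are illustrative assumptions.

```r
library(ggplot2)

set.seed(42)  # arbitrary seed, only so the sketch is reproducible

df <- rbind(
  data.frame(x = rnorm(1000, 0, 2), label = "df1"),
  data.frame(x = rnorm(500, 1, 1), label = "df2")
)

# Same plot as above, using the after_stat() notation
p <- ggplot(df, aes(x, y = after_stat(count), fill = label)) +
  geom_density(color = "black", alpha = 0.7) +
  scale_fill_manual(values = c("#E69F00", "#56B4E9"))

# Pull out the values computed by the density stat for the first layer
built <- ggplot_build(p)$data[[1]]

# Groups are numbered in label order: 1 = df1 (n = 1000), 2 = df2 (n = 500)
n_by_group <- c(1000, 500)[built$group]

# count should equal density * n within each group
print(all.equal(built$count, built$density * n_by_group))
```

With this scaling, the area under each curve is approximately the number of observations in that group, so the df1 curve covers about twice the area of the df2 curve.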

