
ggplot two histograms in one plot

I created the plot below using:

ggplot(data_all, aes(x = Speed, fill = Season)) + 
  theme_bw() +
  geom_histogram(position = "identity", alpha = 0.2, binwidth=0.1)

As you can see, the difference in the amount of data available is very large. Is there a way to look only at the distribution and not at the total data amount?


There is probably a more elegant way to do this, but one approach is to compute each group's kernel density estimate with the density() function and draw the results with geom_line().

library(tidyverse)

density1 <- iris %>% 
  filter(Species == "virginica") %>% 
  pull(Sepal.Length) %>% 
  density() 

density2 <- iris %>% 
  filter(Species == "versicolor") %>% 
  pull(Sepal.Length) %>% 
  density()

data_all <- rbind(
  data.frame(x = density1$x, y = density1$y, species = "virginica"),
  data.frame(x = density2$x, y = density2$y, species = "versicolor")
)

ggplot(data_all) +
  aes(x, y, color = species) +
  geom_line() +
  theme_minimal()
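As one possibly more compact variant of the approach above, geom_density() can compute the per-group density estimate inside ggplot2, so the manual density() and rbind() steps are not needed. This is only a sketch: the two_species name is made up for illustration, and the curves may differ slightly at the tails from the density() version because the two evaluate the estimate over different x ranges.

```r
library(ggplot2)

# Keep the same two species used in the manual density() example.
# `two_species` is a hypothetical name chosen for this sketch.
two_species <- subset(iris, Species %in% c("virginica", "versicolor"))

# geom_density() estimates one density curve per colour group.
p <- ggplot(two_species, aes(x = Sepal.Length, color = Species)) +
  geom_density() +
  theme_minimal()

p
```

Because each curve is a density, both integrate to 1 regardless of how many rows each group has.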


You can reference other values computed by the stat functions using a notation that you may have seen before, written with two dots on each side of the name, as in ..value.. (the dots are part of the syntax). I'm not sure of the proper name for these or where a documented list can be found, but they are sometimes called "special variables" or "calculated aesthetics".

In this case, the default calculated aesthetic on the y axis for geom_histogram() is ..count.. (the number of observations in each bin). When comparing distributions with very different total N, it's more useful to use ..density.. instead, which rescales each group's bars so that they integrate to 1. You can access ..density.. by mapping it to the y aesthetic directly inside the geom_histogram() call.
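To make the difference between counts and densities concrete, here is a small base R sketch of the arithmetic, using hist() rather than ggplot2. The variable names and the bin width of 0.25 are assumptions chosen for illustration. As I understand it, ggplot2's density for a histogram is computed the same way (count divided by the group's N and the bin width), but treat that as an assumption rather than a statement about its internals.

```r
# Two samples with very different sizes (names are illustrative only).
set.seed(8675309)
small <- rnorm(1000, -1, 0.5)
large <- rnorm(100000, 3, 1)

# Shared bins of equal width; the range is wide enough to cover both samples.
bin_width <- 0.25
breaks <- seq(-10, 10, by = bin_width)

h_small <- hist(small, breaks = breaks, plot = FALSE)
h_large <- hist(large, breaks = breaks, plot = FALSE)

# Raw counts depend on sample size, so the larger sample dominates.
max(h_small$counts)
max(h_large$counts)

# density = count / (N * bin width), which removes the dependence on N.
manual_density <- h_small$counts / (length(small) * bin_width)
all.equal(manual_density, h_small$density)

# Each histogram's total area (sum of density * bin width) is 1.
sum(h_small$density * bin_width)
sum(h_large$density * bin_width)
```

On the density scale, the bar heights reflect only the shape of each distribution, which is what makes the two groups comparable.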

First, here's an example of two histograms with vastly different sizes (similar to OP's question):

library(ggplot2)

set.seed(8675309)
df <- data.frame(
 x = c(rnorm(1000, -1, 0.5), rnorm(100000, 3, 1)),
 group = c(rep("A", 1000), rep("B", 100000))
)

ggplot(df, aes(x, fill=group)) + theme_classic() +
  geom_histogram(
    alpha=0.2, color='gray80',
    position="identity", bins=80)


And here's the same plot using ..density..:

ggplot(df, aes(x, fill=group)) + theme_classic() +
  geom_histogram(
    aes(y=..density..), alpha=0.2, color='gray80',
    position="identity", bins=80)
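Note that the ..density.. notation was deprecated in ggplot2 3.4.0 in favor of after_stat(). The sketch below is the same density histogram written with after_stat(density), assuming a ggplot2 version that provides it. The data frame is rebuilt so the block runs on its own, and the p_density and areas names are made up for illustration.

```r
library(ggplot2)

# Same simulated data as in the example above: group B is 100x larger than A.
set.seed(8675309)
df <- data.frame(
  x = c(rnorm(1000, -1, 0.5), rnorm(100000, 3, 1)),
  group = c(rep("A", 1000), rep("B", 100000))
)

# after_stat(density) replaces the older ..density.. spelling.
p_density <- ggplot(df, aes(x, fill = group)) +
  theme_classic() +
  geom_histogram(
    aes(y = after_stat(density)),
    alpha = 0.2, color = "gray80",
    position = "identity", bins = 80
  )

p_density

# Optional check: the bar areas of each group should sum to 1.
bars <- layer_data(p_density)
areas <- tapply(bars$density * (bars$xmax - bars$xmin), bars$group, sum)
areas
```

The optional check at the end uses layer_data() to pull the computed bars out of the plot, which is a handy way to see which computed variables a stat makes available.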

