Adding second density plot undoes binning in first histogram plot

I am having some issues adding a second density plot to an existing density + histogram ggplot. Namely, when I add the second density plot from another data source it changes the number of bins in the histogram of the first plot.

Here is some toy data and a plot to illustrate my problem:

library(ggplot2)

# data
df <- data.frame(var1=rnorm(1e4,0,1), var2=rnorm(1e4,5,1))

# create plot function
plotFunct <- function(data, varName, nBins) {
  p <- ggplot(data, aes_string(x=varName)) + 
              geom_histogram(aes(y=..density..), bins = nBins, fill = "white", colour = "black") +
              geom_density(fill = "#FF6666", alpha = .3)
  return(p)
}

# Now we run the function specifying 40 bins
p <- plotFunct(df, "var1", 40) 
p

[image]

So everything is working as it should be.

Next, we create the second dataset to add to the first graph...

outsideData <- data.frame(outside = rnorm(1e5, -2, 25))

...and add it to the first plot. This data has a much wider spread, so to make the graph more digestible we'll restrict it to a pre-specified range with the coord_cartesian() function:

p2 <- p + geom_density(data = outsideData, aes(x=outside), colour = "green") + coord_cartesian(xlim = c(-5,5))

p2

[image]

The second density plot is drawn in green. Note that after adding it, the histogram from the first plot shows a single bin instead of the forty bins we specified originally. Somehow the second density layer has affected the binning of the first, although the density portion of the original plot seems unaffected.
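The collapse described above can also be checked numerically rather than by eye. The following is a minimal sketch, not part of the original post: it rebuilds a reduced version of the plot and uses ggplot2's layer_data() to count how many histogram bins actually contain observations. The seed, the helper name n_filled, and the use of after_stat(density) (which assumes ggplot2 3.3.0 or later) are additions made for illustration.

```r
library(ggplot2)

set.seed(1)  # added for reproducibility; not in the original post
df <- data.frame(var1 = rnorm(1e4, 0, 1))
outsideData <- data.frame(outside = rnorm(1e5, -2, 25))

# Histogram on its own, asking for 40 bins
base_plot <- ggplot(df, aes(x = var1)) +
  geom_histogram(aes(y = after_stat(density)), bins = 40)

# Same histogram with the wide second density layer added
with_outside <- base_plot +
  geom_density(data = outsideData, aes(x = outside), colour = "green") +
  coord_cartesian(xlim = c(-5, 5))

# Hypothetical helper: number of histogram bins that contain any data.
# Layer 1 is the histogram in both plots.
n_filled <- function(p) sum(layer_data(p, 1)$count > 0)

filled_alone <- n_filled(base_plot)
filled_with_outside <- n_filled(with_outside)

cat("non-empty bins, histogram alone:   ", filled_alone, "\n")
cat("non-empty bins, with outside layer:", filled_with_outside, "\n")
```

If the second count is far smaller than the first, the forty bins are still being created but are spread over a much wider x range, so almost all of var1 lands in one or two of them.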

Can anyone enlighten me how to revert back to the original histogram?

I don't know why this is happening, but here is a possible workaround. It turns out that the behavior you describe does not occur when using binwidth instead of bins. So one approach is to pre-calculate a suitable bin width based on the desired number of bins:

library(ggplot2)

# data
df <- data.frame(var1 = rnorm(1e4, 0, 1), var2 = rnorm(1e4, 5, 1))

# create plot function
plotFunct <- function(data, varName, nBins) {
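  vn <- as.name(varName)
  # pretty() picks about nBins + 1 evenly spaced "round" break points
  cuts <- pretty(data[[varName]], nBins)
  # the spacing between consecutive breaks becomes the bin width
  binWidth <- diff(cuts[1:2])
  cat("using", nBins, "bins converted into binwidth", binWidth, "\n\n")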
  p <- ggplot(data) +
    geom_histogram(
      aes(x = !!vn, y = ..density..),
      binwidth = binWidth,
      fill = "white",
      colour = "black"
    ) +
    geom_density(aes(x=!!vn), fill = "#FF6666", alpha = .3)
  return(p)
}

# Now we run the function specifying 40 bins
p <- plotFunct(df, "var1", 40)
p

outsideData <- data.frame(outside = rnorm(1e5,-2, 25))

p2 <-
  p + geom_density(
    inherit.aes = FALSE,
    data = outsideData,
    aes(x = outside),
    colour = "green"
  ) + coord_cartesian(xlim = c(-5, 5))

p2

Should make this plot:

[image]
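One plausible explanation for why binwidth is immune while bins is not, offered here as an assumption about ggplot2's internals rather than something confirmed in this thread: with bins, the width appears to be derived from the full range of the shared x scale, and that scale covers every layer, including the much wider outsideData. A width computed up front from var1 alone does not depend on the other layers. The base-R sketch below only does the arithmetic (ggplot2's exact formula may differ slightly); the seed and variable names are illustrative.

```r
set.seed(1)  # added for reproducibility; not in the original post
var1 <- rnorm(1e4, 0, 1)
outside <- rnorm(1e5, -2, 25)
nBins <- 40

# Width when the 40 bins only have to span var1
width_alone <- diff(range(var1)) / nBins

# Width when the same 40 bins must span both layers
# (assumed to approximate what `bins` does on a shared x scale)
width_shared <- diff(range(c(var1, outside))) / nBins

# Width from the pretty() approach used above: depends on var1 only
cuts <- pretty(var1, nBins)
width_pretty <- diff(cuts[1:2])

cat("width, var1 only:    ", width_alone, "\n")
cat("width, shared range: ", width_shared, "\n")
cat("width, pretty():     ", width_pretty, "\n")
```

Under this reading, the shared-range width is many times larger than the spread of var1 needs, which would put nearly all of its observations into one or two bars, while the pretty() width stays close to the var1-only width.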
