Adding second density plot undoes binning in first histogram plot

Question

I am having trouble adding a second density plot to an existing density + histogram ggplot. When I add the second density curve from another data source, the number of bins in the first plot's histogram changes.

Here is some toy data and a plot to illustrate the problem:

library(ggplot2)

# data
df <- data.frame(var1=rnorm(1e4,0,1), var2=rnorm(1e4,5,1))

# create plot function
plotFunct <- function(data, varName, nBins) {
  p <- ggplot(data, aes_string(x=varName)) + 
              geom_histogram(aes(y=..density..), bins = nBins, fill = "white", colour = "black") +
              geom_density(fill = "#FF6666", alpha = .3)
  return(p)
}

# Now we run the function specifying 40 bins
p <- plotFunct(df, "var1", 40) 
p

So far everything works as it should.

Next, I create the second dataset to add to the first graph...

outsideData <- data.frame(outside = rnorm(1e5, -2, 25))

...and add it to the first plot. This data has a much wider spread, so to make the graph more digestible I restrict the view to a pre-specified range with coord_cartesian():

p2 <- p + geom_density(data = outsideData, aes(x=outside), colour = "green") + coord_cartesian(xlim = c(-5,5))

p2

The second density curve is in green. Note that after adding it, the histogram in the first plot shows a single bin instead of the forty bins originally specified. Somehow adding the second density layer has affected the binning of the histogram, although the original density curve seems unaffected.

Can anyone explain how to get the original histogram back?

Answer 1

I am not sure why this is happening, but here is a possible workaround. The behavior you describe does not occur when using binwidth instead of bins. So one approach is to pre-calculate a suitable bin width from the desired number of bins:

library(ggplot2)

# data
df <- data.frame(var1 = rnorm(1e4, 0, 1), var2 = rnorm(1e4, 5, 1))

# create plot function
plotFunct <- function(data, varName, nBins) {
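  vn <- as.name(varName)  # symbol to unquote with !! inside aes()
  # pretty() returns roughly nBins evenly spaced breaks; their spacing becomes the bin width
  cuts <- pretty(data[[varName]], nBins)
  binWidth <- abs(cuts[1] - cuts[2])
  cat("using", nBins, "bins converted into binwidth", binWidth, "\n\n")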
  p <- ggplot(data) +
    geom_histogram(
      aes(x = !!vn, y = ..density..),
      binwidth = binWidth,
      fill = "white",
      colour = "black"
    ) +
    geom_density(aes(x=!!vn), fill = "#FF6666", alpha = .3)
  return(p)
}

# Now we run the function specifying 40 bins
p <- plotFunct(df, "var1", 40)
p

outsideData <- data.frame(outside = rnorm(1e5,-2, 25))

p2 <-
  p + geom_density(
    inherit.aes = FALSE,
    data = outsideData,
    aes(x = outside),
    colour = "green"
  ) + coord_cartesian(xlim = c(-5, 5))

p2

This should produce a plot in which the histogram keeps its narrow bins after the green density curve is added.
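
As for why this might happen: a plausible explanation (an assumption on my part, not something checked against the ggplot2 source here) is that with bins the width is derived from the range of the shared x scale, and the second layer stretches that scale to cover outsideData. The base-R sketch below only illustrates that arithmetic. The names widthAlone, widthShared and binsOccupied are made up for the example.

```r
# Sketch: how a fixed bin COUNT turns into a bin WIDTH as the x range grows.
# Assumption: width is roughly (range of the x scale) / (number of bins).
set.seed(1)
var1    <- rnorm(1e4, 0, 1)     # data shown in the histogram
outside <- rnorm(1e5, -2, 25)   # much wider data added by a later layer

nBins <- 40

# Width if only var1 determines the x range
widthAlone <- diff(range(var1)) / nBins

# Width if the x range also has to cover the wider data
widthShared <- diff(range(c(var1, outside))) / nBins

# Approximate number of those wide bins needed to span var1
binsOccupied <- ceiling(diff(range(var1)) / widthShared)

cat("width from var1 alone:  ", widthAlone, "\n")
cat("width from shared range:", widthShared, "\n")
cat("wide bins spanning var1:", binsOccupied, "\n")
```

If that assumption holds, forty bins spread over the wide range leave only a handful covering var1, which matches the collapsed histogram in the question. With binwidth the width is fixed up front, so a wider x range would not change it.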

solution1
1 ACCEPTED 2019-08-27 05:40:04
