
Adding second density plot undoes binning in first histogram plot

I am having some issues adding a second density plot to an existing density + histogram ggplot. Namely, when I add the second density plot from another data source, it changes the number of bins in the histogram of the first plot.

Here is some toy data and a plot to illustrate my problem:

library(ggplot2)

# data
df <- data.frame(var1=rnorm(1e4,0,1), var2=rnorm(1e4,5,1))

# create plot function
plotFunct <- function(data, varName, nBins) {
  p <- ggplot(data, aes_string(x=varName)) + 
              geom_histogram(aes(y=..density..), bins = nBins, fill = "white", colour = "black") +
              geom_density(fill = "#FF6666", alpha = .3)
  return(p)
}

# Now we run the function specifying 40 bins
p <- plotFunct(df, "var1", 40) 
p


So everything is working as it should.

Next, create the second dataset to add to the first graph...

outsideData <- data.frame(outside = rnorm(1e5, -2, 25))

...and add it to the first plot. This data has a much wider spread, so to make the graph more digestible we'll restrict it to a pre-specified range with the coord_cartesian() function:

p2 <- p + geom_density(data = outsideData, aes(x=outside), colour = "green") + coord_cartesian(xlim = c(-5,5))

p2


The second density plot is in green. Note that, as a result of adding it, the histogram in the first plot has a single bin instead of the forty bins we specified originally. Somehow the addition of the second density plot has affected the binning of the first. However, the density portion of the original plot seems unaffected.

Can anyone enlighten me on how to revert to the original histogram?

No clue why this is happening, but here is a possible workaround. It turns out that the behavior you describe does not occur when using binwidth instead of bins. So one approach is to pre-calculate a suitable bin width based on the desired number of bins:

library(ggplot2)

# data
df <- data.frame(var1 = rnorm(1e4, 0, 1), var2 = rnorm(1e4, 5, 1))

# create plot function
plotFunct <- function(data, varName, nBins) {
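  vn <- as.name(varName)
  # pretty() picks "nice" break points giving roughly nBins intervals
  cuts <- pretty(data[[varName]], nBins)
  # the spacing between consecutive breaks is the bin width
  binWidth <- cuts[2] - cuts[1]
  cat("using", nBins, "bins converted into binwidth", binWidth, "\n\n")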
  p <- ggplot(data) +
    geom_histogram(
      aes(x = !!vn, y = after_stat(density)),
      binwidth = binWidth,
      fill = "white",
      colour = "black"
    ) +
    geom_density(aes(x=!!vn), fill = "#FF6666", alpha = .3)
  return(p)
}

# Now we run the function specifying 40 bins
p <- plotFunct(df, "var1", 40)
p

outsideData <- data.frame(outside = rnorm(1e5,-2, 25))

p2 <-
  p + geom_density(
    inherit.aes = FALSE,
    data = outsideData,
    aes(x = outside),
    colour = "green"
  ) + coord_cartesian(xlim = c(-5, 5))

p2

This should produce a plot in which the histogram keeps its original binning and the second density curve is drawn in green.
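As for why the workaround above helps, here is one plausible explanation, which the answer itself does not claim. ggplot2 trains a single x scale across all layers, and when only `bins` is given the histogram's bin width appears to be derived from that shared scale range. Adding the much wider `outside` data would then stretch the 40 bins over a range of roughly a couple of hundred units, so only one or two of them fall inside the `coord_cartesian()` window. The base-R sketch below illustrates that arithmetic only. The helper `width_from_bins` is hypothetical, and its formula is an approximation chosen for illustration, not ggplot2's exact internal computation.

```r
# Base-R sketch (no ggplot2 needed) of how a bin *count* and a bin *width*
# react when the x range grows. The width formula is an approximation
# chosen for illustration, not ggplot2's exact internal computation.

set.seed(42)
var1    <- rnorm(1e4, 0, 1)     # data behind the histogram
outside <- rnorm(1e5, -2, 25)   # much wider data behind the second density

n_bins <- 40

# Hypothetical helper: approximate width when `n` bins must cover `rng`
width_from_bins <- function(rng, n) diff(rng) / n

own_range    <- range(var1)                # x range with the first dataset only
shared_range <- range(c(var1, outside))    # x range once both datasets are plotted

w_own    <- width_from_bins(own_range, n_bins)     # 40 bins over var1 only
w_shared <- width_from_bins(shared_range, n_bins)  # 40 bins over both datasets

# How many bins of each width fit into the zoomed window xlim = c(-5, 5)?
window_size <- 10
bins_in_window_own    <- window_size / w_own
bins_in_window_shared <- window_size / w_shared

cat("approx. width over var1 range:  ", w_own, "\n")
cat("approx. width over shared range:", w_shared, "\n")
cat("approx. bins visible in [-5, 5] with shared range:",
    bins_in_window_shared, "\n")

# A fixed width, computed as in the answer, depends only on var1,
# so adding wider data to the plot cannot change it.
fixed_width <- diff(pretty(var1, n_bins))[1]
cat("fixed width from pretty():", fixed_width, "\n")
```

If this reading is right, any bin width computed from `var1` alone and passed as `binwidth` should keep the histogram stable, whichever layers are added later.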
