Normalizing y-axis in histograms in R ggplot to proportion by group

Question

My question is very similar to "Normalizing y-axis in histograms in R ggplot to proportion", except that I have two groups of data of different sizes, and I would like each proportion to be relative to its group size rather than the total size.

To make it clearer, let's say I have two sets of data in a data frame:

dataA<-rnorm(100,3,sd=2)
dataB<-rnorm(400,5,sd=3)
all<-data.frame(dataset=c(rep('A',length(dataA)),rep('B',length(dataB))),value=c(dataA,dataB))

I can plot the two distributions together with:

ggplot(all,aes(x=value,fill=dataset))+geom_histogram(alpha=0.5,position='identity',binwidth=0.5)

and, instead of the frequency on the y axis, I can show the proportion with:

ggplot(all,aes(x=value,fill=dataset))+geom_histogram(aes(y=..count../sum(..count..)),alpha=0.5,position='identity',binwidth=0.5)

But this gives the proportion relative to the total data size (500 points here). Is it possible to make it relative to each group's size?

My goal is to make it possible to visually compare the proportion of values in a given bin between A and B, independently of their respective sizes. Ideas that differ from my original one are also welcome!

Thanks!

Answer 1

Like this? [edited based on OP's comment]

ggplot(all,aes(x=value,fill=dataset))+
  geom_histogram(aes(y=0.5*..density..),
                 alpha=0.5,position='identity',binwidth=0.5)

Using y=..density.. scales the histograms so that the area under each is 1, i.e. sum(binwidth*y)=1. As a result, you would use y = binwidth*..density.. to have y represent the fraction of each group's total that falls in each bin. In your case, binwidth=0.5.
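To check the arithmetic outside ggplot, here is a minimal sketch in base R. It assumes the same dataA and dataB as in the question; the seed, the breaks vector and the names hA, hB, propA and propB are only illustrative. Since hist() defines density as counts/(n*binwidth), multiplying by the bin width should give the within-group proportion:

```r
set.seed(1)                       # arbitrary seed, only for reproducibility
dataA <- rnorm(100, 3, sd = 2)
dataB <- rnorm(400, 5, sd = 3)
binwidth <- 0.5

# Common breaks that cover both groups, aligned to multiples of binwidth
rng <- range(c(dataA, dataB))
breaks <- seq(floor(rng[1] / binwidth) * binwidth,
              ceiling(rng[2] / binwidth) * binwidth,
              by = binwidth)

# Bin each group separately without drawing anything
hA <- hist(dataA, breaks = breaks, plot = FALSE)
hB <- hist(dataB, breaks = breaks, plot = FALSE)

# binwidth * density is the fraction of each group's points per bin
propA <- binwidth * hA$density
propB <- binwidth * hB$density

# Each group's proportions should sum to 1, whatever the group size
print(sum(propA))
print(sum(propB))
```

Both sums should come out as 1 up to floating-point error, which is what makes the two groups directly comparable bin by bin.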

IMO this is a little easier to interpret:

ggplot(all,aes(x=value,fill=dataset))+
  geom_histogram(aes(y=0.5*..density..),binwidth=0.5)+
  facet_wrap(~dataset,nrow=2)
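Newer ggplot2 releases (3.3.0 and later, as far as I know) prefer after_stat() over the ..name.. notation, and stat_bin also exposes a computed variable width, so the bin width does not have to be hard-coded. The following is a sketch along those lines rather than part of the original answer; the seed, the axis label and the names p and bars are only illustrative:

```r
library(ggplot2)

set.seed(1)                       # arbitrary seed, only for reproducibility
dataA <- rnorm(100, 3, sd = 2)
dataB <- rnorm(400, 5, sd = 3)
all <- data.frame(dataset = c(rep("A", length(dataA)), rep("B", length(dataB))),
                  value = c(dataA, dataB))

# density is computed per group, so density * width is the
# within-group proportion of points in each bin
p <- ggplot(all, aes(x = value, fill = dataset)) +
  geom_histogram(aes(y = after_stat(density * width)),
                 alpha = 0.5, position = "identity", binwidth = 0.5) +
  ylab("proportion within group")

# Sanity check: bar heights should sum to 1 within each group
bars <- layer_data(p)
print(tapply(bars$y, bars$group, sum))
```

Printing p draws the overlaid histograms; the tapply() line is only there to confirm that each group's bars add up to 1.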

Answer 1: score 53, accepted, 2014-03-04 20:10:29