How to add 95% confidence intervals to graph of proportions of factor levels in ggplot?

I wanted to build on the great answer I got to a previously asked question:

Graph proportion within a factor level rather than a count in ggplot2

I was hoping to build on the code:

var1 <- c("Left", "Right", NA, "Left", "Right", "Right", "Right", "Left", "Left", "Right", "Left", "Left","Left", "Right", "Left", "Right", "Right", "Right", "Left", "Left", "Right", NA, "Left", "Left","Left", "Right", NA, "Left", "Right", "Right", "Right", "Left", "Left", "Right", "Left", "Left","Left", "Right", "Left", "Right", "Right", "Right", "Left", "Left", "Right", NA, "Left", "Left")
var2 <- c("Higher", "Lower", NA, "Slightly higher", "Slightly higher", "Slightly higher", "Lower", "Slightly higher", "Higher", "Higher", "Higher", "Slightly higher","Higher", "Lower", "Slightly higher", "Slightly higher", "Slightly higher", "Lower", "Slightly higher", "Higher", "Higher", "Higher", NA, "Slightly lower","Higher", "Lower", NA, "Slightly higher", "Slightly higher", "Slightly higher", "Lower", "Slightly higher", "Higher", "Higher", "Higher", "Slightly higher","Higher", "Lower", "Slightly higher", "Slightly higher", "Slightly higher", "Lower", "Slightly lower", "Higher", "Higher", "Higher", NA, "Slightly lower")
df <- as.data.frame(cbind(var1, var2))

library(dplyr)
library(ggplot2)

df %>%
  na.omit() %>%
  group_by(var1, var2) %>%
  summarise(n = n()) %>%
  mutate(n = n/sum(n)) %>%
  ungroup() %>%
  ggplot() + aes(var2, n, fill = var1) + 
  geom_bar(position = "dodge", stat = "identity") + 
  labs(x = "Answer", y = "Proportion") +
  scale_fill_manual(name = "Condition:", values = c("black", "red")) +
  theme_classic() +
  theme(legend.position = "top")

My goal is to add error bars, in the form of 95% confidence intervals, to each bar on the graph. I tried adding the terms

upperE=(1.96*sqrt(n/sum(n))*(1-(n/sum(n)))/n), lowerE=(-1.96*sqrt(n/sum(n))*(1-(n/sum(n)))/n).

But alas I keep getting errors...

I also tried making an entirely new data frame for the graph, thus:

var1 <- c("Left", "Right", NA, "Left", "Right", "Right", "Right", "Left", "Left", "Right", "Left", "Left","Left", "Right", "Left", "Right", "Right", "Right", "Left", "Left", "Right", NA, "Left", "Left","Left", "Right", NA, "Left", "Right", "Right", "Right", "Left", "Left", "Right", "Left", "Left","Left", "Right", "Left", "Right", "Right", "Right", "Left", "Left", "Right", NA, "Left", "Left")
var2 <- c("Higher", "Lower", NA, "Slightly higher", "Slightly higher", "Slightly higher", "Lower", "Slightly higher", "Higher", "Higher", "Higher", "Slightly higher","Higher", "Lower", "Slightly higher", "Slightly higher", "Slightly higher", "Lower", "Slightly higher", "Higher", "Higher", "Higher", NA, "Slightly lower","Higher", "Lower", NA, "Slightly higher", "Slightly higher", "Slightly higher", "Lower", "Slightly higher", "Higher", "Higher", "Higher", "Slightly higher","Higher", "Lower", "Slightly higher", "Slightly higher", "Slightly higher", "Lower", "Slightly lower", "Higher", "Higher", "Higher", NA, "Slightly lower")
df <- as.data.frame(cbind(var1, var2))



dat <- df %>%
  na.omit() %>%
  group_by(var1, var2) %>%
  summarise(n = n()) %>%
  mutate(prop = n/sum(n),upperE=1.96*sqrt(n/sum(n))*(1-(n/sum(n)))/n, lowerE=-1.96*sqrt(n/sum(n))*(1-(n/sum(n)))/n)

test <- ggplot(dat, aes(x=var2, y = prop, fill = var1))+ 
  geom_bar(position = "dodge", stat = "identity") + geom_errorbar(aes(ymin = lowerE, ymax = upperE),position="dodge")+
  labs(x="Answer",y="Proportion")+
  scale_fill_discrete(name = "Condition:")+ theme_classic()+ 
  theme(legend.position="top") 

This gives me error bars, but they are positioned at 0 on the y-axis rather than on top of each bar.

Does anyone have any suggestions? Thank you!

I have now worked out how to get the error bars to sit at the appropriate position on each bar. I needed to tie the ymin and ymax of the error bar to the values being plotted, thus:

dat <- df %>%
  na.omit() %>%
  group_by(var1, var2) %>%
  summarise(n = n()) %>%
  mutate(prop = n/sum(n),upperE=1.96*sqrt(n/sum(n))*(1-(n/sum(n)))/n, lowerE=-1.96*sqrt(n/sum(n))*(1-(n/sum(n)))/n)

test <- ggplot(dat, aes(x=var2, y = prop, fill = var1))+ 
  geom_bar(position = "dodge", stat = "identity") + geom_errorbar(aes(ymin = prop+lowerE, ymax = prop+upperE),width = .2, position=position_dodge(.9))+
  labs(x="Answer",y="Proportion")+
  scale_fill_discrete(name = "Condition:")+ theme_classic()+ 
  theme(legend.position="top") 

This gave error bars sitting on top of each bar.

The standard error used in a 95% CI for a proportion is se = sqrt(p * (1 - p) / n), where n is the full sample size. The solution above instead computes sqrt(n/sum(n)) * (1 - n/sum(n)) / n, and the n there is only the count in a single cell (the "successes"). The full sample is sum(n). So it should actually be sqrt(n/sum(n) * (1 - n/sum(n)) / sum(n)).
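To make the difference concrete, here is a small base-R sketch that evaluates both expressions for one cell. The numbers (8 observations in a cell, 20 in its group) and the variable names total, se_thread and se_correct are illustrative assumptions, not taken from the answers.

```r
# Illustrative counts (assumed): 8 observations in one cell, 20 in its group.
n     <- 8
total <- 20            # plays the role of sum(n) in the thread's code
p     <- n / total     # observed proportion

# Expression used in the earlier mutate(): the square root covers only p,
# and the division is by the cell count n.
se_thread <- sqrt(p) * (1 - p) / n

# Standard error of a proportion: the square root covers the whole term,
# and the division is by the group total.
se_correct <- sqrt(p * (1 - p) / total)

# Normal-approximation (Wald) 95% interval around p.
ci <- p + c(-1, 1) * 1.96 * se_correct

print(c(se_thread = se_thread, se_correct = se_correct))
print(ci)
```

With these numbers the earlier expression gives a noticeably smaller standard error than the standard formula, so the plotted bars would look more precise than they are.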

Super old thread, but just in case somebody still stumbles upon this: the formula for the confidence intervals in the upvoted answer is incorrect.

It should be:

mutate(prop = n/sum(n),
         upperE=1.96*sqrt(n/sum(n)*(1-(n/sum(n)))/sum(n)), 
         lowerE=-1.96*sqrt(n/sum(n)*(1-(n/sum(n)))/sum(n)))

With the formula that you used for the confidence intervals, you only take the square root of the first part of the formula. However, you need to take the square root of the entire expression (everything except the Z score, 1.96).
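Putting the corrections from this thread together, a minimal end-to-end sketch might look like the following. It reuses the question's data and column names. The helper columns total, se, lower and upper are names introduced here for illustration, and count() followed by group_by(var1) is swapped in for the original group_by()/summarise() so that the denominator is explicit.

```r
library(dplyr)
library(ggplot2)

# Data from the question.
var1 <- c("Left", "Right", NA, "Left", "Right", "Right", "Right", "Left", "Left", "Right", "Left", "Left",
          "Left", "Right", "Left", "Right", "Right", "Right", "Left", "Left", "Right", NA, "Left", "Left",
          "Left", "Right", NA, "Left", "Right", "Right", "Right", "Left", "Left", "Right", "Left", "Left",
          "Left", "Right", "Left", "Right", "Right", "Right", "Left", "Left", "Right", NA, "Left", "Left")
var2 <- c("Higher", "Lower", NA, "Slightly higher", "Slightly higher", "Slightly higher", "Lower", "Slightly higher", "Higher", "Higher", "Higher", "Slightly higher",
          "Higher", "Lower", "Slightly higher", "Slightly higher", "Slightly higher", "Lower", "Slightly higher", "Higher", "Higher", "Higher", NA, "Slightly lower",
          "Higher", "Lower", NA, "Slightly higher", "Slightly higher", "Slightly higher", "Lower", "Slightly higher", "Higher", "Higher", "Higher", "Slightly higher",
          "Higher", "Lower", "Slightly higher", "Slightly higher", "Slightly higher", "Lower", "Slightly lower", "Higher", "Higher", "Higher", NA, "Slightly lower")
df <- data.frame(var1, var2)

dat <- df %>%
  na.omit() %>%
  count(var1, var2) %>%                        # n = number of rows per (var1, var2) cell
  group_by(var1) %>%                           # proportions are taken within each var1 level
  mutate(total = sum(n),                       # group size, the denominator of the proportion
         prop  = n / total,
         se    = sqrt(prop * (1 - prop) / total),
         lower = prop - 1.96 * se,             # Wald 95% limits
         upper = prop + 1.96 * se) %>%
  ungroup()

p <- ggplot(dat, aes(x = var2, y = prop, fill = var1)) +
  geom_col(position = position_dodge(0.9)) +
  # Same dodge width as the bars so each error bar is centred on its bar.
  geom_errorbar(aes(ymin = lower, ymax = upper),
                width = 0.2, position = position_dodge(0.9)) +
  labs(x = "Answer", y = "Proportion") +
  scale_fill_manual(name = "Condition:", values = c("black", "red")) +
  theme_classic() +
  theme(legend.position = "top")

print(p)
```

The interval here is the plain normal approximation used throughout the thread. With cell counts this small it is only a rough guide, and the lower limit can drop below 0.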
