Adding error bars to ggplot2 bar plot after group by in dplyr
I have the following data in R.
oligo condition score
REF Sample 27.827
REF Sample 24.622
REF Sample 31.042
REF Competitor 21.066
REF Competitor 18.413
REF Competitor 36.164
ALT Sample 75.465
ALT Sample 57.058
ALT Sample 66.408
ALT Competitor 35.420
ALT Competitor 17.652
ALT Competitor 21.466
I have munged this and taken the average of the scores for each oligo and condition using the group_by and summarise functions in dplyr.
emsa_test <- emsa_1 %>%
  group_by(oligo, condition) %>%
  summarise_all(mean)
This creates the following table.
oligo condition score
ALT Competitor 24.84600
ALT Sample 66.31033
REF Competitor 25.21433
REF Sample 27.83033
I then plotted this using ggplot2.
ggplot(emsa_test, aes(oligo, score)) +
  geom_bar(aes(fill = condition),
           width = 0.4, position = position_dodge(width = 0.5),
           color = "black", stat = "identity", size = .3) +
  theme_bw() +
  ggtitle("CEBP\u03b1") +
  theme(plot.title = element_text(size = 40, face = "bold", hjust = 0.5)) +
  scale_fill_manual(values = c("#d8b365", "#f5f5f5"))
My issue is that I need to add error bars to the plot. The implementation would be similar to this.
geom_errorbar(aes(ymin=len-se, ymax=len+se), width=.1, position=pd)
However, after the data is munged, the max and min information contained in the original table is lost. I could add the error bars manually, but I have several plots to make, so I wonder if there is a way to retain this information through the pipeline.
Many thanks.
You can calculate the components on the fly with dplyr like this:
library(tidyverse)
df <- read_table(
"oligo condition score
REF Sample 27.827
REF Sample 24.622
REF Sample 31.042
REF Competitor 21.066
REF Competitor 18.413
REF Competitor 36.164
ALT Sample 75.465
ALT Sample 57.058
ALT Sample 66.408
ALT Competitor 35.420
ALT Competitor 17.652
ALT Competitor 21.466"
)
df %>%
  group_by(oligo, condition) %>%
  summarise(
    mean = mean(score),
    sd = sd(score),
    n = n(),
    se = sd / sqrt(n)  # standard error of the mean
  ) %>%
  ggplot(aes(x = oligo, y = mean, fill = condition)) +
  geom_col(position = position_dodge()) +
  geom_errorbar(
    aes(ymin = mean - se, ymax = mean + se),
    position = position_dodge2(padding = 0.5)
  ) +
  labs(
    title = "Mean Score ± 1 SE"
  )
Created on 2019-04-01 by the reprex package (v0.2.1)
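Since the question mentions several plots, the same summary could be wrapped in a small function and reused. This is only a sketch: `summarise_se` is a hypothetical helper name, not part of dplyr. It assumes dplyr 1.0.0 or later (for the `.groups` argument) and computes the standard error as sd / sqrt(n). The data frame is rebuilt inline from the question so the snippet is self-contained.

```r
library(dplyr)

# Data copied from the question
emsa_1 <- data.frame(
  oligo = rep(c("REF", "ALT"), each = 6),
  condition = rep(rep(c("Sample", "Competitor"), each = 3), times = 2),
  score = c(27.827, 24.622, 31.042, 21.066, 18.413, 36.164,
            75.465, 57.058, 66.408, 35.420, 17.652, 21.466)
)

# Hypothetical helper: mean, sd, n and standard error of `value`
# for each combination of the grouping columns passed in `...`
summarise_se <- function(data, value, ...) {
  data %>%
    group_by(...) %>%
    summarise(
      mean = mean({{ value }}),
      sd = sd({{ value }}),
      n = n(),
      se = sd / sqrt(n),
      .groups = "drop"
    )
}

emsa_se <- summarise_se(emsa_1, score, oligo, condition)
```

The result has one row per oligo and condition, with `mean` and `se` columns that can be passed to geom_errorbar as `ymin = mean - se` and `ymax = mean + se`.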
You can summarise to more than one value and preserve min, max and mean:
emsa_test <- emsa_1 %>%
  group_by(oligo, condition) %>%
  summarise(mean = mean(score), min = min(score), max = max(score))
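As a sketch of how such a summary could feed the plot, the min and max columns can be mapped directly to the error bar limits. Note that this draws the observed range of each group, not a standard error. The names `emsa_range` and `p_range` are made up for this example, the data frame is rebuilt inline from the question, and `.groups = "drop"` assumes dplyr 1.0.0 or later.

```r
library(dplyr)
library(ggplot2)

# Data copied from the question
emsa_1 <- data.frame(
  oligo = rep(c("REF", "ALT"), each = 6),
  condition = rep(rep(c("Sample", "Competitor"), each = 3), times = 2),
  score = c(27.827, 24.622, 31.042, 21.066, 18.413, 36.164,
            75.465, 57.058, 66.408, 35.420, 17.652, 21.466)
)

emsa_range <- emsa_1 %>%
  group_by(oligo, condition) %>%
  summarise(mean = mean(score), min = min(score), max = max(score),
            .groups = "drop")

# Bars show the group mean; error bars span the observed min to max.
# Using the same dodge width for both layers keeps them aligned.
p_range <- ggplot(emsa_range, aes(x = oligo, y = mean, fill = condition)) +
  geom_col(position = position_dodge(width = 0.9)) +
  geom_errorbar(aes(ymin = min, ymax = max),
                width = 0.2, position = position_dodge(width = 0.9))
```

Printing `p_range` draws the plot; theme and colour settings from the question can be added to it with `+` as usual.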
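I don't have enough reputation to comment, but a note on JasonAizkalns's answer in case others simply copy the code: the standard error should be computed as se = sd / sqrt(n), not sd / n.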
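To illustrate the difference, here is a small base R check on one group from the question (ALT / Sample). The variable names `se_wrong` and `se_right` are made up for this sketch.

```r
# Scores for the ALT / Sample group, copied from the question
scores <- c(75.465, 57.058, 66.408)

n <- length(scores)
se_wrong <- sd(scores) / n        # divides by n: too small whenever n > 1
se_right <- sd(scores) / sqrt(n)  # standard error of the mean
```

With three replicates the two values differ by a factor of sqrt(3), so dividing by n would make the error bars noticeably too short.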