Plot hline at mean with geom_bar and stat="identity"

I have a barplot where the exact bar heights are in the dataframe.

library(ggplot2)

df <- data.frame(x=LETTERS[1:6], y=c(1:6, 1:6 + 1), g=rep(x = c("a", "b"), each=6))

ggplot(df, aes(x=x, y=y, fill=g, group=g)) + 
  geom_bar(stat="identity", position="dodge")

Now I want to add two hlines displaying the mean of all bars per group. All I get with

ggplot(df, aes(x=x, y=y, fill=g, group=g)) + 
  geom_bar(stat="identity", position="dodge") +
  stat_summary(fun.y=mean, aes(yintercept=..y.., group=g), geom="hline")

is a plot with many horizontal lines rather than one per group.

As I want to do this for an arbitrary number of groups as well, I would appreciate a solution with ggplot only.

I want to avoid a solution like this, because it does not rely purely on the dataset passed to ggplot, has redundant code, and is not flexible in the number of groups:

ggplot(df, aes(x=x, y=y, fill=g, group=g)) + 
  geom_bar(stat="identity", position="dodge") +
  geom_hline(yintercept=mean(df$y[df$g=="a"]), col="red") +
  geom_hline(yintercept=mean(df$y[df$g=="b"]), col="green")

Thanks in advance!

Edits:

  • added dataset
  • comment on resulting code
  • changed the data and plots to clarify the question

If I understand your question correctly, your first approach is almost there:

ggplot(df, aes(x = x, y = y, fill = g, group = g)) + 
  geom_col(position="dodge") + # geom_col is equivalent to geom_bar(stat = "identity")
  stat_summary(fun.y = mean, aes(x = 1, yintercept = ..y.., group = g), geom = "hline")
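
On more recent ggplot2 releases the same idea can be written without the older argument names. The sketch below assumes ggplot2 3.3.0 or later, where stat_summary takes `fun` in place of `fun.y` and `after_stat(y)` stands in for `..y..`; the object name `p` is only for illustration.

```r
library(ggplot2)

# Same data as in the question
df <- data.frame(x = LETTERS[1:6], y = c(1:6, 1:6 + 1), g = rep(c("a", "b"), each = 6))

# Assumption: ggplot2 >= 3.3.0, which provides `fun` and after_stat()
p <- ggplot(df, aes(x = x, y = y, fill = g, group = g)) +
  geom_col(position = "dodge") +
  stat_summary(
    fun = mean,
    # x = 1 overrides the inherited x = x, so the mean is taken per group only
    aes(x = 1, yintercept = after_stat(y), group = g),
    geom = "hline"
  )

p
```

Because the grouping comes from the data, this should carry over to any number of levels in g without extra geom_hline calls.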

According to the help file for stat_summary:

stat_summary operates on unique x; ...

In this case, stat_summary has inherited the top-level aesthetic mappings of x = x and group = g by default, so it calculates the mean y value at each x for each value of g, resulting in a lot of horizontal lines. Adding x = 1 to stat_summary's mapping overrides x = x (while retaining group = g), so we get a single mean y value for each value of g instead.
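
The difference between the two groupings can be checked outside ggplot with base R. This is only a rough illustration of the grouping logic, not what stat_summary runs internally; `per_bar` and `per_group` are names made up for the example.

```r
# Same data as in the question
df <- data.frame(x = LETTERS[1:6], y = c(1:6, 1:6 + 1), g = rep(c("a", "b"), each = 6))

# Grouping by both x and g (the inherited mappings): one mean per bar
per_bar <- aggregate(y ~ x + g, data = df, FUN = mean)
nrow(per_bar)  # 12 rows, one per bar

# Grouping by g only (x held constant): one mean per group
per_group <- aggregate(y ~ g, data = df, FUN = mean)
per_group      # a: 3.5, b: 4.5
```

With x and g together every group holds a single bar, so each "mean" is just that bar's height. Dropping x from the grouping leaves the two group means, 3.5 and 4.5.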

