
ggplot2 boxplots - How to group factor levels on the x-axis (and add reference lines for each group mean)

I have 30 plant species for which I have displayed the distributions of midday leaf water potential (lwp_md) using boxplots and the package ggplot2. But how do I group these species along the x-axis according to their leaf habits (e.g. Deciduous, Evergreen), and also display a reference line indicating the mean lwp_md value for each leaf habit level?

I have attempted this with the package forcats but really have no idea how to proceed. I can't find anything after an extensive search online. The best I seem able to do is order species by some other function, e.g. the median.

Below is an example of my code so far. Note that I have used the packages ggplot2 and ggthemes:

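library(ggplot2)
library(ggthemes)  # provides theme_few()
library(forcats)   # provides fct_reorder()

# Order species by median lwp_md (descending) before plotting
ggplot(zz, aes(x=fct_reorder(species, lwp_md, .fun=median, .desc=TRUE), y=lwp_md)) +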
  geom_boxplot(aes(fill=leaf_habit)) +
  theme_few(base_size=14) +
  theme(legend.position="top", 
        axis.text.x=element_text(size=8, angle=45, vjust=1, hjust =1)) +
  xlab("Species") +
  ylab("Maximum leaf water potential (MPa)") +
  scale_y_reverse() +
  scale_fill_discrete(name="Leaf habit",
                      breaks=c("DEC", "EG"),
                      labels=c("Deciduous", "Evergreen"))

Here's a subset of my data including 4 of my species (2 deciduous, 2 evergreen):

# Output of dput(zz), assigned back to zz so it can be pasted and run directly
zz <- structure(list(id = 1:20, species = structure(c(1L, 1L, 1L, 1L, 
1L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L
), .Label = c("AMYELE", "BURSIM", "CASXYL", "COLARB"), class = "factor"), 
    leaf_habit = structure(c(2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 
    1L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L), .Label = c("DEC", 
    "EG"), class = "factor"), lwp_md = c(-2.1, -2.5, -2.35, -2.6, 
    -2.45, -1.7, -1.55, -1.4, -1.55, -0.6, -2.6, -3.6, -2.9, 
    -3.1, -3.3, -2, -1.8, -2, -4.9, -5.35)), class = "data.frame", row.names = c(NA, 
-20L))

An example of how I'm looking to display my data (cut and edited); I would like species on the x-axis and lwp_md on the y-axis: [image]

ggplot2 orders a discrete axis by factor level, and factor levels default to alphabetical order. To avoid this, supply species as a factor whose levels are already in the order you want. This can be done by arranging the data.frame by leaf_habit and then redeclaring the factor. To generate the mean value, we can use group_by and mutate to add a new mean column to the data frame, which can then be plotted.
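To make the reordering step concrete before the full plot, here is a minimal sketch of just the data preparation. It rebuilds the question's sample data compactly (the id column is left out, which is my simplification) and prints the resulting level order and group means. The `ungroup()` call is an addition of mine so that later steps do not inherit the grouping.

```r
library(dplyr)

# Same values as the question's sample data, rebuilt compactly (id column omitted)
zz <- data.frame(
  species    = factor(rep(c("AMYELE", "BURSIM", "CASXYL", "COLARB"), each = 5)),
  leaf_habit = factor(rep(c("EG", "DEC", "EG", "DEC"), each = 5)),
  lwp_md     = c(-2.1, -2.5, -2.35, -2.6, -2.45,
                 -1.7, -1.55, -1.4, -1.55, -0.6,
                 -2.6, -3.6, -2.9, -3.1, -3.3,
                 -2, -1.8, -2, -4.9, -5.35)
)

# Sort rows by leaf habit and attach each habit's mean to every row
zz2 <- zz %>%
  arrange(leaf_habit) %>%
  group_by(leaf_habit) %>%
  mutate(mean = mean(lwp_md)) %>%
  ungroup()

# Redeclare species so its levels follow the sorted row order
zz2$species <- factor(zz2$species, levels = unique(zz2$species))

levels(zz2$species)
# "BURSIM" "COLARB" "AMYELE" "CASXYL"  (deciduous first, then evergreen)

distinct(zz2, leaf_habit, mean)
# DEC -2.285, EG -2.75
```

Because the levels now run deciduous first and evergreen second, the boxplots for each leaf habit end up next to each other on the x-axis.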

Here is the complete code:

library(ggplot2)
library(ggthemes)
library(dplyr)

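# Sort rows by leaf habit, then attach each habit's mean lwp_md to every row
zz2 <- zz %>% arrange(leaf_habit) %>% group_by(leaf_habit) %>% mutate(mean=mean(lwp_md))
# Redeclare species so its levels follow the sorted row order (grouped by leaf habit)
zz2$species <- factor(zz2$species,levels=unique(zz2$species))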

ggplot(zz2, aes(x=species, y=lwp_md)) +
  geom_boxplot(aes(fill=leaf_habit)) +
  theme_few(base_size=14) +
  theme(legend.position="top", 
        axis.text.x=element_text(size=8, angle=45, vjust=1, hjust =1)) +
  xlab("Species") +
  ylab("Maximum leaf water potential (MPa)") +
  scale_y_reverse() +
  scale_fill_discrete(name="Leaf habit",
                      breaks=c("DEC", "EG"),
                      labels=c("Deciduous", "Evergreen")) +
  # A zero-height error bar at the group mean acts as a per-habit reference line
  geom_errorbar(aes(species, ymax = mean, ymin = mean),
                size=0.5, linetype = "longdash", inherit.aes = FALSE, width = 1)
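A different way to get the same grouping, offered here only as a sketch and not as part of the answer above, is to put each leaf habit in its own facet and draw the group mean with `geom_hline`. The names `habit_means`, `mean_lwp` and `habit_labels` are hypothetical, and the sample data is rebuilt inline (without the id column) so the block runs on its own.

```r
library(ggplot2)
library(dplyr)

# Same values as the question's sample data, rebuilt compactly (id column omitted)
zz <- data.frame(
  species    = factor(rep(c("AMYELE", "BURSIM", "CASXYL", "COLARB"), each = 5)),
  leaf_habit = factor(rep(c("EG", "DEC", "EG", "DEC"), each = 5)),
  lwp_md     = c(-2.1, -2.5, -2.35, -2.6, -2.45,
                 -1.7, -1.55, -1.4, -1.55, -0.6,
                 -2.6, -3.6, -2.9, -3.1, -3.3,
                 -2, -1.8, -2, -4.9, -5.35)
)

# One row per leaf habit holding its mean lwp_md (hypothetical helper table)
habit_means <- zz %>%
  group_by(leaf_habit) %>%
  summarise(mean_lwp = mean(lwp_md), .groups = "drop")

# Readable strip labels for the two habit codes
habit_labels <- c(DEC = "Deciduous", EG = "Evergreen")

p <- ggplot(zz, aes(x = species, y = lwp_md)) +
  geom_boxplot(aes(fill = leaf_habit), show.legend = FALSE) +
  # One dashed line per panel, because habit_means carries the faceting variable
  geom_hline(data = habit_means, aes(yintercept = mean_lwp),
             linetype = "longdash") +
  # free_x drops species that do not belong to a panel; space keeps box widths equal
  facet_grid(. ~ leaf_habit, scales = "free_x", space = "free_x",
             labeller = labeller(leaf_habit = habit_labels)) +
  scale_y_reverse() +
  labs(x = "Species", y = "Leaf water potential (MPa)")

# Call print(p), or type p at the console, to render the plot
```

With facets, no factor releveling is needed: each panel shows only its own species, and the strip label replaces the fill legend. The trade-off is that the panels are visually separated rather than sharing one continuous axis.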

