ggplot2 boxplots - How to group factor levels on the x-axis (and add reference lines for each group mean)
I have 30 plant species for which I have displayed the distributions of midday leaf water potential (lwp_md) using boxplots and the package ggplot2. But how do I group these species along the x-axis according to their leaf habits (e.g. Deciduous, Evergreen), as well as display a reference line indicating the mean lwp_md value for each leaf habit level?
I have attempted this with the package forcats but really have no idea how to proceed with this one. I can't find anything after an extensive search online. The best I seem able to do is order species by some other function, e.g. the median.
Below is an example of my code so far. Note I have used the packages ggplot2 and ggthemes:
library(ggplot2)
library(ggthemes)  # provides theme_few()
library(forcats)   # provides fct_reorder()

# Order species by median lwp_md (descending) and fill boxes by leaf habit
ggplot(zz, aes(x = fct_reorder(species, lwp_md, .fun = median, .desc = TRUE), y = lwp_md)) +
  geom_boxplot(aes(fill = leaf_habit)) +
  theme_few(base_size = 14) +
  theme(legend.position = "top",
        axis.text.x = element_text(size = 8, angle = 45, vjust = 1, hjust = 1)) +
  xlab("Species") +
  ylab("Maximum leaf water potential (MPa)") +
  scale_y_reverse() +
  scale_fill_discrete(name = "Leaf habit",
                      breaks = c("DEC", "EG"),
                      labels = c("Deciduous", "Evergreen"))
Here's a subset of my data including 4 of my species (2 deciduous, 2 evergreen):
> dput(zz)
structure(list(id = 1:20, species = structure(c(1L, 1L, 1L, 1L,
1L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L
), .Label = c("AMYELE", "BURSIM", "CASXYL", "COLARB"), class = "factor"),
leaf_habit = structure(c(2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L), .Label = c("DEC",
"EG"), class = "factor"), lwp_md = c(-2.1, -2.5, -2.35, -2.6,
-2.45, -1.7, -1.55, -1.4, -1.55, -0.6, -2.6, -3.6, -2.9,
-3.1, -3.3, -2, -1.8, -2, -4.9, -5.35)), class = "data.frame", row.names = c(NA,
-20L))
An example of how I'm looking to display my data (cut and edited); I would like species on the x-axis and lwp_md on the y-axis.
ggplot2 places a factor's levels along the axis in level order, which is alphabetical by default. To avoid this you have to supply the factor with its levels already in the order you want. This can be done by arranging the data.frame by leaf_habit and then redeclaring the species factor so that its levels follow the sorted row order. To generate the mean value we can use group_by and mutate to add a new mean column to the data frame, which can later be plotted.
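As a rough illustration of those two steps, here is a minimal base R sketch on a toy data frame. The name toy and its four rows are assumptions made for the example (one value per species, taken from the question's sample); ave() stands in for group_by plus mutate.

```r
# Toy data in the shape of zz: one row per species (illustrative subset only)
toy <- data.frame(
  species    = factor(c("AMYELE", "BURSIM", "CASXYL", "COLARB")),
  leaf_habit = factor(c("EG", "DEC", "EG", "DEC")),
  lwp_md     = c(-2.1, -1.7, -2.6, -2.0)
)

levels(toy$species)  # alphabetical by default

# Sort rows by leaf habit, then rebuild the factor so levels follow row order
toy <- toy[order(toy$leaf_habit), ]
toy$species <- factor(toy$species, levels = unique(as.character(toy$species)))
levels(toy$species)  # "BURSIM" "COLARB" "AMYELE" "CASXYL"

# Per-habit mean repeated on every row of that habit
toy$mean <- ave(toy$lwp_md, toy$leaf_habit)
```

After this, the deciduous species come first on the axis, followed by the evergreen ones, and every row carries the mean of its own leaf habit.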
Here is the complete code:
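library(ggplot2)
library(ggthemes)
library(dplyr)

# Sort rows by leaf habit and attach each habit's mean lwp_md to its rows
zz2 <- zz %>%
  arrange(leaf_habit) %>%
  group_by(leaf_habit) %>%
  mutate(mean = mean(lwp_md)) %>%
  ungroup()

# Redeclare species so that its levels follow the sorted row order
zz2$species <- factor(zz2$species, levels = unique(zz2$species))

ggplot(zz2, aes(x = species, y = lwp_md)) +
  geom_boxplot(aes(fill = leaf_habit)) +
  theme_few(base_size = 14) +
  theme(legend.position = "top",
        axis.text.x = element_text(size = 8, angle = 45, vjust = 1, hjust = 1)) +
  xlab("Species") +
  ylab("Maximum leaf water potential (MPa)") +
  scale_y_reverse() +
  scale_fill_discrete(name = "Leaf habit",
                      breaks = c("DEC", "EG"),
                      labels = c("Deciduous", "Evergreen")) +
  # Zero-height error bars (ymin == ymax) draw a horizontal segment at the group mean;
  # width = 1 makes neighbouring segments join into one line per leaf habit.
  # In ggplot2 >= 3.4.0, linewidth replaces size for line thickness.
  geom_errorbar(aes(species, ymax = mean, ymin = mean),
                size = 0.5, linetype = "longdash", inherit.aes = FALSE, width = 1)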
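A possible alternative, not part of the answer above, is to let facets do the grouping: one panel per leaf habit, with geom_hline drawing that habit's mean. The sketch below rebuilds the question's 20-row sample so it runs on its own; habit_means and mean_lwp are names chosen for this example. Within each panel the species keep their default alphabetical order.

```r
library(ggplot2)
library(dplyr)

# The question's sample data, rebuilt compactly (4 species, 5 measurements each)
zz <- data.frame(
  species    = factor(rep(c("AMYELE", "BURSIM", "CASXYL", "COLARB"), each = 5)),
  leaf_habit = factor(rep(c("EG", "DEC", "EG", "DEC"), each = 5)),
  lwp_md     = c(-2.1, -2.5, -2.35, -2.6, -2.45, -1.7, -1.55, -1.4, -1.55, -0.6,
                 -2.6, -3.6, -2.9, -3.1, -3.3, -2, -1.8, -2, -4.9, -5.35)
)

# One row per leaf habit holding its mean lwp_md
habit_means <- zz %>%
  group_by(leaf_habit) %>%
  summarise(mean_lwp = mean(lwp_md), .groups = "drop")

p <- ggplot(zz, aes(x = species, y = lwp_md)) +
  geom_boxplot(aes(fill = leaf_habit)) +
  # Because habit_means contains the faceting variable, each line lands in its own panel
  geom_hline(data = habit_means, aes(yintercept = mean_lwp), linetype = "longdash") +
  # free_x drops species that do not belong to a panel; space keeps box widths equal
  facet_grid(cols = vars(leaf_habit), scales = "free_x", space = "free_x",
             labeller = as_labeller(c(DEC = "Deciduous", EG = "Evergreen"))) +
  scale_y_reverse() +
  xlab("Species") +
  ylab("Maximum leaf water potential (MPa)")

p
```

The trade-off is that the groups are separated by panel strips rather than sharing one continuous axis, but no manual reordering of factor levels is needed.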