How can I overlay by-group plot elements to ggplot2 facets?
My question has to do with faceting. In my example code below, I look at some faceted scatterplots, then try to overlay information (in this case, mean lines) on a per-facet basis.
The tl;dr version is that my attempts fail. Either my added mean lines are computed across all the data (ignoring the facet variable), or I try to write a formula and R throws an error, followed by incisive and particularly disparaging comments about my mother.
library(ggplot2)
# Let's pretend we're exploring the relationship between a car's weight and its
# horsepower, using some sample data
p <- ggplot()
p <- p + geom_point(aes(x = wt, y = hp), data = mtcars)
print(p)
# Hmm. A quick check of the data reveals that car weights can differ wildly, by almost
# a thousand pounds.
head(mtcars)
# Does the difference matter? It might, especially if most 8-cylinder cars are heavy,
# and most 4-cylinder cars are light. ColorBrewer to the rescue!
p <- p + aes(color = factor(cyl))
p <- p + scale_color_brewer(palette = "Set1")
print(p)
# At this point, what would be great is if we could more strongly visually separate
# the cars out by their engine blocks.
p <- p + facet_grid(~ cyl)
print(p)
# Ah! Now we can see (given the fixed scales) that the 4-cylinder cars flock to the
# left on weight measures, while the 8-cylinder cars flock right. But you know what
# would be REALLY awesome? If we could visually compare the means of the car groups.
p.with.means <- p + geom_hline(
  aes(yintercept = mean(hp)),
  data = mtcars
)
print(p.with.means)
# Wait, that's not right. That's not right at all. The green (8-cylinder) cars are all above the
# average for their group. Are they somehow made in an auto plant in Lake Wobegon, MN? Obviously,
# I meant to draw mean lines factored by GROUP. Except also obviously, since the code below will
# print an error, I don't know how.
p.with.non.lake.wobegon.means <- p + geom_hline(
  aes(yintercept = mean(hp) ~ cyl),
  data = mtcars
)
print(p.with.non.lake.wobegon.means)
There must be some simple solution I'm missing.
You mean something like this:
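library(plyr)  # provides ddply() and .()
# One row per cyl value, holding that group's mean horsepower
rs <- ddply(mtcars, .(cyl), summarise, mn = mean(hp))
p + geom_hline(data = rs, aes(yintercept = mn))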
It might be possible to do this within the ggplot call using stat_*, but I'd have to go back and tinker a bit. But generally, if I'm adding summaries to a faceted plot, I calculate the summaries separately and then add them with their own geom.
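As a rough illustration of that separate summary step without plyr, here is a base R sketch. The names grand_mean and rs_base are made up for this example. It also shows the single value that mean(hp) yields when it is evaluated over the whole data set, which appears to be why the first attempt in the question drew the same line in every facet.

```r
# Grand mean: one number for the whole data set
grand_mean <- mean(mtcars$hp)

# Per-group means: one row per cylinder count (base R alternative to ddply)
rs_base <- aggregate(hp ~ cyl, data = mtcars, FUN = mean)

# Rename the summary column to match the mn column used with ddply above
names(rs_base)[names(rs_base) == "hp"] <- "mn"

rs_base
```

Because rs_base keeps the cyl column, a layer that uses it as its data can be split by facet_grid(~ cyl) in the same way as the points.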
EDIT
Just a few expanded notes on your original attempt. Generally it's a good idea to put the aes calls that will persist throughout the plot in ggplot, and then specify different data sets or aesthetics only in those geoms that differ from the 'base' plot. Then you don't need to keep specifying data = ... in each geom.
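A minimal sketch of that layout, assuming a current ggplot2 and using illustrative names (group_means, mean_hp) that are not from the original post: the data and shared aesthetics go in ggplot(), and only the mean-line layer brings its own data and mapping.

```r
library(ggplot2)

# Summary data for the overlay: one mean per cylinder group.
# (group_means and mean_hp are illustrative names.)
group_means <- aggregate(hp ~ cyl, data = mtcars, FUN = mean)
names(group_means)[names(group_means) == "hp"] <- "mean_hp"

# Data and the aesthetics shared by the layers live in ggplot()
p <- ggplot(mtcars, aes(x = wt, y = hp, colour = factor(cyl))) +
  facet_grid(~ cyl) +
  geom_point() +                         # uses the plot-level data and aes
  geom_hline(data = group_means,         # layer-specific data
             aes(yintercept = mean_hp))  # layer-specific aesthetic
```

Calling print(p) renders the plot. Each facet should show one horizontal line at its own group's mean, spanning the full panel width.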
Finally, I came up with a kind of clever use of geom_smooth to do something similar to what you're asking:
p <- ggplot(data = mtcars, aes(x = wt, y = hp, colour = factor(cyl))) +
  facet_grid(~ cyl) +
  geom_point() +
  geom_smooth(se = FALSE, method = "lm", formula = y ~ 1, colour = "black")
The horizontal line (i.e. a constant regression equation) will only extend to the limits of the data in each facet, but this approach skips the separate data summary step.
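The trick relies on the fact that an intercept-only least-squares fit (y ~ 1) estimates a single constant, and that constant is the sample mean. A quick base R check of this for each cylinder group might look like the following (by_cyl, intercepts, and group_means are illustrative names):

```r
# Split the data by cylinder count, as the facets do
by_cyl <- split(mtcars, mtcars$cyl)

# Intercept of hp ~ 1 fitted within each group
intercepts <- sapply(by_cyl, function(d) unname(coef(lm(hp ~ 1, data = d))))

# Plain mean of hp within each group
group_means <- sapply(by_cyl, function(d) mean(d$hp))

# The two vectors should agree up to floating-point tolerance
all.equal(intercepts, group_means)
```

So the black line drawn by geom_smooth in each facet sits at that facet's mean hp. It is only drawn across the range of wt values observed there.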