ggplot2 geom_smooth, extended model for method=lm
I would like to use geom_smooth to get a fitted line from a certain linear regression model.
It seems to me that the formula can only take x and y, and no additional variable.
To show more clearly what I want:
library(dplyr)
library(ggplot2)
set.seed(35413)
df <- data.frame(pred = runif(100, 10, 100),
                 factor = sample(c("A", "B"), 100, replace = TRUE)) %>%
  mutate(
    outcome = 100 + 10 * pred +
      ifelse(factor == "B", 200, 0) +
      ifelse(factor == "B", 4, 0) * pred +
      rnorm(100, 0, 60))
With
ggplot(df, aes(x = pred, y = outcome, color = factor)) +
  geom_point(aes(color = factor)) +
  geom_smooth(method = "lm") +
  theme_bw()
I produce fitted lines that, due to the color=factor option, are basically the output of the linear model lm(outcome ~ pred*factor, df): one intercept and one slope per level of factor.
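That equivalence can be checked numerically. The following is a minimal sketch, assuming the same simulated df as above (rebuilt here so the snippet runs on its own); fit_int, fit_by_group, and fitted_sep are names introduced only for illustration.

```r
# Rebuild the simulated data so this snippet is self-contained
set.seed(35413)
df <- data.frame(pred = runif(100, 10, 100),
                 factor = sample(c("A", "B"), 100, replace = TRUE))
df$outcome <- 100 + 10 * df$pred +
  ifelse(df$factor == "B", 200, 0) +
  ifelse(df$factor == "B", 4, 0) * df$pred +
  rnorm(100, 0, 60)

# One model with an interaction term
fit_int <- lm(outcome ~ pred * factor, data = df)

# Separate simple regressions per level, which is what geom_smooth
# does when color = factor splits the data into groups
fit_by_group <- lapply(split(df, df$factor),
                       function(d) lm(outcome ~ pred, data = d))

# Collect the per-group fitted values in the original row order
fitted_sep <- numeric(nrow(df))
for (lev in names(fit_by_group)) {
  idx <- df$factor == lev
  fitted_sep[idx] <- predict(fit_by_group[[lev]], newdata = df[idx, ])
}

# Compare the two sets of fitted values
all.equal(unname(fitted(fit_int)), fitted_sep)
```

Only the point estimates coincide. The interaction model pools the residual variance across both groups, so its standard errors differ from those of the per-group fits that geom_smooth draws.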
In some cases, however, I prefer the lines to be the output of a different model fit, like lm(outcome ~ pred + factor, df), for which I can use something like:
fit <- lm(outcome ~ pred + factor, df)
predval <- expand.grid(
  pred = seq(
    min(df$pred), max(df$pred), length.out = 1000),
  factor = unique(df$factor)) %>%
  mutate(outcome = predict(fit, newdata = .))
ggplot(df, aes(x = pred, y = outcome, color = factor)) +
  geom_point() +
  geom_line(data = predval) +
  theme_bw()
which results in two parallel fitted lines, one per level of factor.
My question: is there a way to produce the latter graph using geom_smooth instead? I know there is a formula = option in geom_smooth, but I can't make something like formula = y ~ x + factor or formula = y ~ x + color (as I defined color = factor) work.
This is a very interesting question. Probably the main reason why geom_smooth is so "resistant" to allowing custom models of multiple variables is that it is limited to producing 2-D curves; consequently, its arguments are designed for handling two-dimensional data (i.e. formula = response variable ~ independent variable).
The trick to getting what you requested is to use the mapping argument within geom_smooth instead of formula. As you've probably seen from the documentation, formula only allows you to specify the mathematical structure of the model in terms of x and y (e.g. linear, quadratic, etc.). Conversely, the mapping argument lets you override the y aesthetic for this layer, for example with the output of a custom linear model obtained via predict().
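To make the distinction concrete, here is a minimal sketch of what formula can change, namely the functional form in x within each colour group. It assumes the simulated df from the question (rebuilt here so the snippet runs on its own); p_quad is a name introduced only for illustration.

```r
library(ggplot2)

# Rebuild the question's simulated data so this snippet is self-contained
set.seed(35413)
df <- data.frame(pred = runif(100, 10, 100),
                 factor = sample(c("A", "B"), 100, replace = TRUE))
df$outcome <- 100 + 10 * df$pred +
  ifelse(df$factor == "B", 200, 0) +
  ifelse(df$factor == "B", 4, 0) * df$pred +
  rnorm(100, 0, 60)

# formula is written in terms of the aesthetics x and y;
# here it swaps the straight line for a quadratic,
# still fitted separately for each colour group
p_quad <- ggplot(df, aes(x = pred, y = outcome, color = factor)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ poly(x, 2)) +
  theme_bw()
```

Because the model is fitted to one colour group at a time, the grouping variable is constant within each fit, which is why a term such as + factor cannot do anything useful inside formula.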
Note that, by default, inherit.aes is set to TRUE, so the layer still inherits color = factor from the main ggplot() call and your plotted regressions will be coloured by your categorical variable. Here's the code:
# original plot: one lm per colour group
plot1 <- ggplot(df, aes(x = pred, y = outcome, color = factor)) +
  geom_point(aes(color = factor)) +
  geom_smooth(method = "lm") +
  ggtitle("outcome ~ pred") +
  theme_bw()
# declare new model here
plm <- lm(formula = outcome ~ pred + factor, data = df)
# plot with lm for outcome ~ pred + factor
# (geom_smooth refits lm per colour group to the values predicted by plm)
plot2 <- ggplot(df, aes(x = pred, y = outcome, color = factor)) +
  geom_point(aes(color = factor)) +
  geom_smooth(method = "lm", mapping = aes(y = predict(plm, df))) +
  ggtitle("outcome ~ pred + factor") +
  theme_bw()
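One caveat with this approach: geom_smooth fits its own lm to the predicted values, which lie exactly on a straight line within each group, so the shaded band it draws reflects that refit and not the uncertainty of plm. If a confidence band for the additive model is wanted, one option is to compute it with predict(..., interval = "confidence") and draw it with geom_ribbon. This is a sketch under the assumption that the simulated df from the question is used (rebuilt here so the snippet runs on its own); grid, ci, and plot3 are names introduced only for illustration.

```r
library(ggplot2)

# Rebuild the question's simulated data so this snippet is self-contained
set.seed(35413)
df <- data.frame(pred = runif(100, 10, 100),
                 factor = sample(c("A", "B"), 100, replace = TRUE))
df$outcome <- 100 + 10 * df$pred +
  ifelse(df$factor == "B", 200, 0) +
  ifelse(df$factor == "B", 4, 0) * df$pred +
  rnorm(100, 0, 60)

# Additive model: common slope, separate intercepts
plm <- lm(outcome ~ pred + factor, data = df)

# Prediction grid covering the observed range of pred for both levels
grid <- expand.grid(pred = seq(min(df$pred), max(df$pred), length.out = 200),
                    factor = unique(df$factor))

# predict() returns a matrix with columns fit, lwr, upr
ci <- predict(plm, newdata = grid, interval = "confidence")
grid$outcome <- ci[, "fit"]
grid$lwr <- ci[, "lwr"]
grid$upr <- ci[, "upr"]

plot3 <- ggplot(df, aes(x = pred, y = outcome, color = factor)) +
  geom_point() +
  # band from the additive model itself, not from a per-group refit
  geom_ribbon(data = grid,
              aes(x = pred, ymin = lwr, ymax = upr, fill = factor),
              alpha = 0.2, color = NA, inherit.aes = FALSE) +
  geom_line(data = grid) +
  ggtitle("outcome ~ pred + factor, with model-based band") +
  theme_bw()
```

If only the lines matter, adding se = FALSE to the geom_smooth call in plot2 hides the band and leaves the two parallel lines.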