
Using ggplot2 to plot an already-existing linear model

Let's say that I have some data and I have created a linear model to fit the data. Then I plot the data using ggplot2 and I want to add the linear model to the plot. As far as I know, this is the standard way of doing it (using the built-in cars dataset):

library(ggplot2)
fit <- lm(dist ~ speed, data = cars)
summary(fit)
p <- ggplot(cars, aes(speed, dist))
p <- p + geom_point()
p <- p + geom_smooth(method='lm')
p

However, the above violates the DRY principle ('don't repeat yourself'): it creates the linear model in the call to lm and then recreates it in the call to geom_smooth. This seems inelegant to me, and it also leaves room for bugs. For example, if I change the model that is created with lm but forget to change the model that is created with geom_smooth, then the summary and the plot won't be of the same model.

Is there a way of using ggplot2 to plot an already existing linear model, e.g. by passing the lm object itself to the geom_smooth function?

What one needs to do is to create a new data frame with the observations from the old one plus the predicted values from the model, then plot that dataframe using ggplot2.

library(ggplot2)

# create and summarise model
cars.model <- lm(dist ~ speed, data = cars)
summary(cars.model) 

# add 'fit', 'lwr', and 'upr' columns to dataframe (generated by predict)
cars.predict <- cbind(cars, predict(cars.model, interval = 'confidence'))

# plot the points (actual observations), regression line, and confidence interval
p <- ggplot(cars.predict, aes(speed,dist))
p <- p + geom_point()
p <- p + geom_line(aes(speed, fit))
p <- p + geom_ribbon(aes(ymin=lwr,ymax=upr), alpha=0.3)
p

The great advantage of doing this is that if one changes the model (e.g. cars.model <- lm(dist ~ poly(speed, 2), data = cars)), then the plot and the summary both change with it.
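To make that point concrete, here is a minimal sketch that wraps the two steps in helper functions and reuses them for a linear and a quadratic model. The names with_confidence_band and plot_fitted_model are hypothetical helpers written for this example, not part of ggplot2, and the sketch assumes the model's predict method accepts interval = 'confidence', as it does for lm objects.

```r
library(ggplot2)

# Hypothetical helper: return the data with 'fit', 'lwr' and 'upr' columns
# appended, as computed by predict() for the given model
with_confidence_band <- function(model, data) {
  cbind(data, predict(model, newdata = data, interval = "confidence"))
}

# Hypothetical helper: plot observations, fitted line and confidence band.
# x and y are column names given as strings.
plot_fitted_model <- function(model, data, x, y) {
  ggplot(with_confidence_band(model, data), aes(.data[[x]], .data[[y]])) +
    geom_point() +
    geom_ribbon(aes(ymin = lwr, ymax = upr), alpha = 0.3) +
    geom_line(aes(y = fit))
}

linear_model    <- lm(dist ~ speed, data = cars)
quadratic_model <- lm(dist ~ poly(speed, 2), data = cars)

# The same plotting code serves both models; only the model object changes
p_linear    <- plot_fitted_model(linear_model, cars, "speed", "dist")
p_quadratic <- plot_fitted_model(quadratic_model, cars, "speed", "dist")
```

Printing p_linear or p_quadratic draws the corresponding plot. Because the plot is built from the model object, summary(quadratic_model) and p_quadratic always describe the same fit.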

Thanks to Plamen Petrov for making me realise what was needed here. As he points out, this approach will only work if predict is defined for the model in question; if not, one has to define it oneself.

I believe you want to do something along these lines:

library(ggplot2)

# install.packages('dplyr')
library(dplyr)

fit <- lm(dist ~ speed, data = cars)

cars %>%
  mutate(my_model = predict(fit)) %>%
  ggplot() +
  geom_point(aes(speed, dist)) +
  geom_line(aes(speed, my_model))

This will also work for more complex models as long as the corresponding predict method is defined. Otherwise you will need to define it yourself.
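As a rough illustration of defining predict yourself, the sketch below invents a tiny model class and gives it an S3 predict method. Everything named origin_fit or fit_origin is hypothetical and exists only for this example: it fits a least-squares line through the origin and stores the slope in a classed list.

```r
library(ggplot2)

# Hypothetical model class for illustration: a least-squares line through
# the origin, stored as a list with class "origin_fit"
fit_origin <- function(data, x, y) {
  slope <- sum(data[[x]] * data[[y]]) / sum(data[[x]]^2)
  structure(list(slope = slope, x = x), class = "origin_fit")
}

# S3 method: predict(model, newdata) dispatches here for "origin_fit" objects
predict.origin_fit <- function(object, newdata, ...) {
  object$slope * newdata[[object$x]]
}

model <- fit_origin(cars, "speed", "dist")

# Same pattern as before: add the predictions as a column, then plot
cars_pred <- cbind(cars, my_model = predict(model, newdata = cars))

p <- ggplot(cars_pred) +
  geom_point(aes(speed, dist)) +
  geom_line(aes(speed, my_model))
```

Once the method exists, the plotting code does not need to know what kind of model produced the my_model column.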

In the case of a linear model, you can add confidence or prediction bands with slightly more work and reproduce your plot.
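One possible way to do that extra work is sketched below, using the interval argument of predict for lm objects. The grid of 100 evenly spaced speeds, the column names in bands, and the alpha values are choices made for this example only.

```r
library(ggplot2)

fit <- lm(dist ~ speed, data = cars)

# Evaluate the model on an evenly spaced grid of speeds
grid <- data.frame(speed = seq(min(cars$speed), max(cars$speed), length.out = 100))
conf <- predict(fit, newdata = grid, interval = "confidence")
pred <- predict(fit, newdata = grid, interval = "prediction")

# Collect the fitted line and both sets of limits in one data frame
bands <- data.frame(
  speed = grid$speed, fit = conf[, "fit"],
  conf_lwr = conf[, "lwr"], conf_upr = conf[, "upr"],
  pred_lwr = pred[, "lwr"], pred_upr = pred[, "upr"]
)

# Wide band: prediction interval; narrow band: confidence interval
p <- ggplot(bands, aes(speed)) +
  geom_ribbon(aes(ymin = pred_lwr, ymax = pred_upr), alpha = 0.15) +
  geom_ribbon(aes(ymin = conf_lwr, ymax = conf_upr), alpha = 0.3) +
  geom_line(aes(y = fit)) +
  geom_point(data = cars, aes(y = dist))
p
```

Passing newdata explicitly also sidesteps the warning that predict gives when a prediction interval is requested for the data the model was fitted on.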
