My goal is to produce a graph showing the difference between regression lines when a predictor is treated as continuous vs. categorical. I'm using the "SleepStudy" dataset from Lock5Data, and I want to show the regression lines predicting GPA from ClassYear as either continuous or categorical. The code is below:
library(Lock5Data)
data("SleepStudy")
fit2 <- lm(GPA ~ factor(ClassYear), data = SleepStudy)
fit2_line <- aggregate(fit2$fitted.values ~ SleepStudy$ClassYear, FUN = mean)
colnames(fit2_line) <- c('ClassYear','GPA')
options(repr.plot.width=5, repr.plot.height=5)
library(ggplot2)
ggplot() +
geom_line(data=fit2_line, aes(x=ClassYear, y=GPA)) + # Fit line, ClassYear factor
geom_smooth(data=SleepStudy, method='lm', formula=GPA~ClassYear) + # Fit line, ClassYear continuous
geom_point(data=SleepStudy, aes(x=ClassYear, y=GPA)) # Data points as dots
What is producing the blank graph? What am I missing here?
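You have to define the data and the x/y mapping for geom_smooth(), most simply in the ggplot() call so that every layer inherits them. In your code ggplot() is empty and geom_smooth() has no aes(), so it has no x and y to work with. The formula argument of geom_smooth() is also written in terms of the aesthetics (y ~ x), not the column names. This code works: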
ggplot(data=SleepStudy, aes(y = GPA,x = ClassYear)) +
geom_smooth(data=SleepStudy, method='lm', formula=y~x)+
geom_line(data=fit2_line, aes(x=ClassYear, y=GPA)) +
geom_point(data=SleepStudy, aes(x=ClassYear, y=GPA))
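As a side note on why the two lines differ, here is a minimal base-R sketch. It assumes a small made-up data frame (the name toy and its values are hypothetical stand-ins, not the real SleepStudy data) with the same column names. With factor(ClassYear) the fitted values are just the per-year mean GPA, which is why the aggregate() step in the question recovers one point per class year, while the continuous fit forces a single slope across all years.

```r
# Made-up stand-in data (NOT the real SleepStudy values), same column names.
toy <- data.frame(
  ClassYear = c(1, 1, 2, 2, 3, 3, 4, 4),
  GPA       = c(3.0, 3.2, 3.1, 3.5, 3.0, 3.2, 3.4, 3.6)
)

# Categorical: one coefficient per class year, so each fitted value is a group mean.
fit_cat <- lm(GPA ~ factor(ClassYear), data = toy)

# Continuous: a single intercept and slope shared by all class years.
fit_num <- lm(GPA ~ ClassYear, data = toy)

# Predict once per class year from both models.
years    <- data.frame(ClassYear = 1:4)
pred_cat <- unname(predict(fit_cat, newdata = years))
pred_num <- unname(predict(fit_num, newdata = years))

# Per-year mean GPA, for comparison with the categorical fit.
group_means <- as.numeric(tapply(toy$GPA, toy$ClassYear, mean))

print(data.frame(ClassYear   = years$ClassYear,
                 categorical = pred_cat,
                 continuous  = pred_num,
                 group_mean  = group_means))
```

The categorical column should match group_mean, and the continuous column should change by the same amount from one year to the next.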