I am trying to add a linear regression model to my plot. I have this data frame:
            watershed        sqm        cfs
3  deerfieldwatershed 1718617392 22703.8851
5      greenwatershed  233458430  1637.4895
6      northwatershed  240348182  3281.9921
8      southwatershed   68031782   867.6428
and my current code is:
ggplot(dischargevsarea, aes(x = sqm, y = cfs, color = watershed)) +
  geom_point(aes(color = watershed), size = 2) +
  labs(y = "Discharge (cfs)", x = "Area (sq. m)", color = "Watershed") +
  scale_color_manual(values = c("#BAC4C1", "#37B795", "#00898F", "#002245"),
                     labels = c("Deerfield", "Green", "North", "South")) +
  theme_minimal() +
  geom_smooth(method = "lm", se = FALSE)
When it runs, a line is added to the point keys in the legend, but no line shows up on the graph itself. I suspect it is drawing a line individually for each point, but I want one regression line for all four points. How would I get the line I want to show up? Thanks.
You're right: because color is mapped in your first aes, your points are grouped into separate categories, one per watershed. When you call geom_smooth, it fits a regression line for each category, which in your example means one for each single point. A line cannot be fit through a single point, so that's why no regression line is drawn.
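The grouping effect can be checked outside ggplot. The following is a minimal sketch in base R using the data from the question; the names per_group_slopes and pooled_fit are only illustrative and do not come from the original post. It fits one model per watershed and one model across all rows.

```r
# Data from the question, rebuilt as a plain data frame
dischargevsarea <- data.frame(
  watershed = c("deerfieldwatershed", "greenwatershed",
                "northwatershed", "southwatershed"),
  sqm = c(1718617392, 233458430, 240348182, 68031782),
  cfs = c(22703.8851, 1637.4895, 3281.9921, 867.6428)
)

# One fit per watershed, which mirrors what the color grouping asks for.
# Each group holds a single row, so the slope cannot be estimated (NA).
per_group_slopes <- sapply(
  split(dischargevsarea, dischargevsarea$watershed),
  function(d) unname(coef(lm(cfs ~ sqm, data = d))["sqm"])
)
print(per_group_slopes)

# One fit across all four rows gives a usable intercept and slope.
pooled_fit <- lm(cfs ~ sqm, data = dischargevsarea)
print(coef(pooled_fit))
```

With one observation per group there is nothing to estimate a slope from, which matches the empty plot in the question.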
To get a single regression line for all points, pass the color argument only in the aes of geom_point. Alternatively, you can set inherit.aes = FALSE in geom_smooth, which tells ggplot to ignore the mappings from the ggplot() call, and then supply new mappings in geom_smooth itself.
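The inherit.aes = FALSE route might look like the sketch below. It assumes the question's data frame and keeps color in the top-level aes; the black line color and the variable name p are arbitrary choices for illustration, not part of the original answer.

```r
library(ggplot2)

# Data from the question, rebuilt as a plain data frame
dischargevsarea <- data.frame(
  watershed = c("deerfieldwatershed", "greenwatershed",
                "northwatershed", "southwatershed"),
  sqm = c(1718617392, 233458430, 240348182, 68031782),
  cfs = c(22703.8851, 1637.4895, 3281.9921, 867.6428)
)

p <- ggplot(dischargevsarea, aes(x = sqm, y = cfs, color = watershed)) +
  geom_point(size = 2) +
  # inherit.aes = FALSE drops the top-level mappings (including color),
  # so x and y are supplied again and all points form a single group.
  geom_smooth(aes(x = sqm, y = cfs),
              method = "lm", formula = y ~ x, se = FALSE,
              inherit.aes = FALSE,
              color = "black") +  # fixed line color, chosen arbitrarily
  theme_minimal()

print(p)
```

The data is still inherited from ggplot(); only the aesthetic mappings are dropped for that layer.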
To display the equation on the graph (as you asked in the comments), you can use the stat_poly_eq function from the ggpmisc package (this SO post describes its use: "Add regression line equation and R^2 on graph"):
library(ggplot2)
library(ggpmisc)

# color is mapped only in geom_point, so geom_smooth fits one line to all points
ggplot(df, aes(x = sqm, y = cfs)) +
  labs(y = "Discharge (cfs)", x = "Area (sq. m)", color = "Watershed") +
  scale_color_manual(values = c("#BAC4C1", "#37B795", "#00898F", "#002245"),
                     labels = c("Deerfield", "Green", "North", "South")) +
  theme_minimal() +
  geom_smooth(method = "lm", se = FALSE, formula = y ~ x) +
  # after_stat() replaces the older ..eq.label.. notation (ggplot2 >= 3.3.0)
  stat_poly_eq(formula = y ~ x,
               aes(label = paste(after_stat(eq.label), after_stat(rr.label),
                                 sep = "~~~")),
               parse = TRUE) +
  geom_point(aes(color = watershed))
Data
# dput() output of the original data.table, with the non-portable
# .internal.selfref pointer dropped; a plain data.frame is enough for plotting
df <- structure(list(watershed = c("deerfieldwatershed", "greenwatershed",
"northwatershed", "southwatershed"), sqm = c(1718617392L, 233458430L,
240348182L, 68031782L), cfs = c(22703.8851, 1637.4895, 3281.9921,
867.6428)), row.names = c(NA, -4L), class = "data.frame")