Adding regression line to graph

I am trying to add a linear regression model to my plot. I have this data frame:

           watershed        sqm        cfs
3 deerfieldwatershed 1718617392 22703.8851
5     greenwatershed  233458430  1637.4895
6     northwatershed  240348182  3281.9921
8     southwatershed   68031782   867.6428

and my current code is:

ggplot(dischargevsarea, aes(x = sqm, y = cfs, color = watershed)) + 
  geom_point(aes(color = watershed), size = 2) + 
  labs(y= "Discharge (cfs)", x = "Area (sq. m)", color = "Watershed") + 
  scale_color_manual(values = c("#BAC4C1", "#37B795", 
                                "#00898F", "#002245"),
                     labels = c("Deerfield", "Green", "North",
                                "South")) + 
  theme_minimal() + 
  geom_smooth(method = "lm", se = FALSE)

When it runs, a line is added to the points in the legend, but no line shows up on the graph (see image below). I suspect it is drawing a separate line for each point, but I want one regression line through all four points. How do I get that line to show up? Thanks.

[Image: plot of discharge versus area for the 4 watersheds]

You're right: because color = watershed is in your first aes, the points are grouped by watershed, so geom_smooth fits a separate regression line for each group. In your example each group is a single point, so no line can be drawn. That is why you don't get a single regression line.

To get one regression line for all points, map color only in the aes of geom_point. Alternatively, pass inherit.aes = FALSE to geom_smooth so that it ignores the mappings set in ggplot(), and give it its own aes with just x and y.
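The inherit.aes = FALSE variant can be sketched as follows. This is a minimal example that assumes the four rows from the question are stored in a plain data frame called dischargevsarea. The object name p is only illustrative.

```r
library(ggplot2)

# Same four rows as in the question, rebuilt as a plain data.frame
dischargevsarea <- data.frame(
  watershed = c("deerfieldwatershed", "greenwatershed",
                "northwatershed", "southwatershed"),
  sqm = c(1718617392, 233458430, 240348182, 68031782),
  cfs = c(22703.8851, 1637.4895, 3281.9921, 867.6428)
)

# color stays in the global aes, so the points are still colored by watershed
p <- ggplot(dischargevsarea, aes(x = sqm, y = cfs, color = watershed)) +
  geom_point(size = 2) +
  # inherit.aes = FALSE drops the global mapping (including color), so the
  # smoother sees all four points as one group; x and y must be mapped again
  geom_smooth(aes(x = sqm, y = cfs), method = "lm", formula = y ~ x,
              se = FALSE, inherit.aes = FALSE)

p
```

Because the smoother no longer inherits color, the line is drawn in the default smooth color and does not appear in the watershed legend.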

To display the equation on the graph (as you asked in the comments), you can use the stat_poly_eq function from the ggpmisc package (this SO post describes its use: Add regression line equation and R^2 on graph):

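library(ggplot2)
library(ggpmisc)
# x and y only in the global aes, so geom_smooth fits one line through all points
ggplot(df, aes(x = sqm, y = cfs)) +
  labs(y = "Discharge (cfs)", x = "Area (sq. m)", color = "Watershed") +
  scale_color_manual(values = c("#BAC4C1", "#37B795",
                                "#00898F", "#002245"),
                     labels = c("Deerfield", "Green", "North",
                                "South")) +
  theme_minimal() +
  geom_smooth(method = "lm", se = FALSE, formula = y ~ x) +
  # after_stat() replaces the older ..eq.label.. notation (ggplot2 >= 3.3.0)
  stat_poly_eq(formula = y ~ x,
               aes(label = paste(after_stat(eq.label), after_stat(rr.label),
                                 sep = "~~~")),
               parse = TRUE) +
  # color is mapped here only, so it affects the points but not the regression
  geom_point(aes(color = watershed))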
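To check the equation and R^2 that stat_poly_eq prints, you can fit the same model directly with lm(). This is a small sketch, assuming the question's values are in a data frame named df. The name fit is only illustrative.

```r
# The four (area, discharge) pairs from the question
df <- data.frame(
  sqm = c(1718617392, 233458430, 240348182, 68031782),
  cfs = c(22703.8851, 1637.4895, 3281.9921, 867.6428)
)

# Same model that geom_smooth(method = "lm", formula = y ~ x) fits
fit <- lm(cfs ~ sqm, data = df)

coef(fit)               # intercept and slope of the regression line
summary(fit)$r.squared  # R^2, should agree with the label on the plot
```

With only four points, one of them far from the others, the fit is dominated by the largest watershed, so treat the R^2 with some caution.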

Data

# The dput() output came from a data.table; the .internal.selfref pointer cannot
# be parsed, so it is dropped and the object is rebuilt as a plain data.frame
df <- structure(list(watershed = c("deerfieldwatershed", "greenwatershed", 
"northwatershed", "southwatershed"), sqm = c(1718617392L, 233458430L, 
240348182L, 68031782L), cfs = c(22703.8851, 1637.4895, 3281.9921, 
867.6428)), row.names = c(NA, -4L), class = "data.frame")
