Highlight highest residuals in a plot: R

I'm trying to learn how to highlight and annotate some points in a plot. For the purpose of a reproducible example, I'm using the UBSprices dataset from the alr4 package.

I'm drawing an OLS line and a y = x line. I want to highlight and annotate the points that are farthest from the OLS line (that is, the ones with the largest absolute residuals).

Here's my code so far:

library(ggplot2)
library(alr4)
ggplot(UBSprices, aes(x = bigmac2003, y = bigmac2009)) + geom_point() + geom_smooth(method = "lm", se = FALSE) + 
  geom_abline(color = "green", size = 1) + coord_fixed()

You could calculate the residuals and then identify those with an absolute value greater than some cutoff quantile. For example:

library(tidyverse)
library(alr4)

UBSprices %>% 
  # residuals from the OLS fit; mark the top 10% by absolute value
  mutate(resid = resid(lm(bigmac2009 ~ bigmac2003, data = .)),
         mark = abs(resid) >= quantile(abs(resid), probs = 0.9)) %>% 
  ggplot(aes(x = bigmac2003, y = bigmac2009)) + 
  geom_point(aes(colour = mark), show.legend = FALSE) + 
  geom_smooth(method = "lm", se = FALSE) + 
  geom_abline(color = "green", size = 1) +  # y = x reference line
  coord_fixed() +
  theme_bw() +
  scale_colour_manual(values = c("blue", "red"))
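
The question also asks for annotations, which the code above does not add. One possible way, as a minimal sketch, is to label only the marked points with geom_text(). This assumes the row names of UBSprices hold the city names (as I understand the alr4 documentation) and that neither price column has missing values; the names plot_data, city and cutoff are my own, and the 0.9 quantile is just the same illustrative threshold.

```r
library(ggplot2)
library(alr4)  # provides UBSprices; row names are assumed to be city names

# Fit the OLS model once and keep the residuals alongside the data
fit <- lm(bigmac2009 ~ bigmac2003, data = UBSprices)
plot_data <- UBSprices
plot_data$city  <- rownames(UBSprices)
plot_data$resid <- resid(fit)

# Flag the points whose absolute residual is in the top 10%
cutoff <- quantile(abs(plot_data$resid), probs = 0.9)
plot_data$mark <- abs(plot_data$resid) >= cutoff

p <- ggplot(plot_data, aes(x = bigmac2003, y = bigmac2009)) +
  geom_point(aes(colour = mark), show.legend = FALSE) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  geom_abline(colour = "green") +  # y = x reference line
  # Label only the flagged points, nudged slightly above each one
  geom_text(data = subset(plot_data, mark), aes(label = city),
            vjust = -0.8, size = 3) +
  coord_fixed() +
  theme_bw() +
  scale_colour_manual(values = c("FALSE" = "blue", "TRUE" = "red"))
p
```

If the labels overlap, geom_text_repel() from the ggrepel package can be swapped in for geom_text(), at the cost of an extra dependency.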

