
Separate regression lines overlaid in ggplot2

I'm trying, without success, to replicate the plot shown below. The thick red regression lines should come from the int3 data.frame. The thin grey regression lines should come from the int2 data.frame.

Is there a fix to achieve my desired plot?

library(lme4)
library(tidyverse)

dd <- read.csv('https://raw.githubusercontent.com/rnorouzian/e/master/3.csv')

m31 <- lmer(math~year+(1|schoolid/childid), data = dd)

co <- coef(m31)

int2 = co$`childid:schoolid`
int2$ch_id <- substr(rownames(int2), 11, 14)
  
int3 = co$schoolid
int3$sch_id <- rownames(int3)


ggplot(data = dd, aes(x = year, y = math, group = factor(childid)))+ geom_point()+ facet_wrap(~schoolid)+
geom_abline(data = int2, aes(intercept=`(Intercept)`, slope=year)) +  
  geom_abline(data = int3, aes(intercept=`(Intercept)`, slope=year), color=2)

[Image: the desired plot]

You need to add columns corresponding to the group variable and the faceting variable to the data frames int2 and int3 so that the regression lines will only appear on the appropriate panel and will be split appropriately by group.

It looks like this was partially done in your example, but incorrectly, because the ID columns were given different names. They should have the same names as the corresponding variables in dd.

int2 = co$`childid:schoolid`
int2$childid <- substr(rownames(int2), 1, 9)
int2$schoolid <- substr(rownames(int2), 11, 14)

int3 = co$schoolid
int3$schoolid <- rownames(int3)


ggplot(data = dd, aes(x = year, y = math, group = factor(childid)))+ geom_point()+ facet_wrap(~schoolid)+
  geom_abline(data = int2, aes(intercept=`(Intercept)`, slope=year)) +  
  geom_abline(data = int3, aes(intercept=`(Intercept)`, slope=year), color=2)

[Image: resulting plot]
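The substr() calls above rely on fixed character positions, so they only work if every ID has the same width. As a minimal sketch of a width-independent alternative, the row names can be split on the ":" separator instead. The row names below are made-up values in the "childid:schoolid" form, not taken from dd, and the names rn, parts and ids are only illustrative.

```r
# Hypothetical row names in the "childid:schoolid" form (not real values from dd)
rn <- c("101480302:2020", "173559292:2020", "2755:31")

# Split each name on the literal ":" and stack the pieces into a two-column matrix
parts <- do.call(rbind, strsplit(rn, ":", fixed = TRUE))

# Name the columns after the variables used for grouping and faceting
ids <- data.frame(childid  = as.integer(parts[, 1]),
                  schoolid = as.integer(parts[, 2]))
ids
```

With the real model, rn would be rownames(int2), and the two resulting columns could be bound onto int2.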

So far so good. However, in your example plot each regression line appears to span only the range of x-values observed in its own data. To get that, replace geom_abline with geom_segment. I calculated the range of year values for each schoolid and for each childid, then computed the segment endpoints from those ranges.
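The endpoint arithmetic is just the line equation y = intercept + slope * x evaluated at the smallest and largest observed x. A small self-contained illustration with made-up numbers (lines_df and its columns are hypothetical, not part of the model output):

```r
# Hypothetical coefficients and observed x-ranges for two groups
lines_df <- data.frame(
  id        = c("a", "b"),
  intercept = c(10, 5),
  slope     = c(2, -1),
  min_x     = c(1, 0),
  max_x     = c(4, 3)
)

# Evaluate each line at both ends of its own x-range
lines_df$min_y <- lines_df$intercept + lines_df$slope * lines_df$min_x
lines_df$max_y <- lines_df$intercept + lines_df$slope * lines_df$max_x
lines_df
```

Each row then holds one segment from (min_x, min_y) to (max_x, max_y), which is the shape of data geom_segment expects.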

# Calculate the observed range of year for each schoolid and each childid.
year_range_by_schoolid <- dd %>%
  group_by(schoolid) %>%
  summarize(min_year = min(year), max_year = max(year), .groups = "drop")
year_range_by_childid <- dd %>%
  group_by(schoolid, childid) %>%
  summarize(min_year = min(year), max_year = max(year), .groups = "drop")

# Find the endpoints of each line: y = intercept + slope * x at min and max year.
# The IDs are converted to integer so the join keys match the year-range tables.
int2 <- int2 %>% 
  mutate(schoolid = as.integer(schoolid),
         childid = as.integer(childid)) %>%
  left_join(year_range_by_childid, by = c("schoolid", "childid")) %>%
  mutate(min_y = `(Intercept)` + year * min_year,
         max_y = `(Intercept)` + year * max_year)

int3 <- int3 %>% 
  mutate(schoolid = as.integer(schoolid)) %>%
  left_join(year_range_by_schoolid, by = "schoolid") %>%
  mutate(min_y = `(Intercept)` + year * min_year,
         max_y = `(Intercept)` + year * max_year)


ggplot(data = dd, aes(x = year, y = math, group = factor(childid)))+ geom_point()+ facet_wrap(~schoolid)+
  geom_segment(data = int2, aes(x = min_year, y = min_y, xend = max_year, yend = max_y)) +  
  geom_segment(data = int3, aes(x = min_year, y = min_y, xend = max_year, yend = max_y, group = factor(schoolid)), color=2)

[Image: resulting plot]
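A different route, not the one used above, is to let the model compute the fitted values and draw them with geom_line, which naturally limits each line to the observed x-range. The sketch below uses simulated data rather than dd, so the data frame toy and the columns fit_child and fit_school are hypothetical. It assumes predict() on an lmer fit with the re.form argument, and ggplot2 3.4 or later for linewidth.

```r
library(lme4)
library(ggplot2)

set.seed(1)

# Simulated nested data (hypothetical): 4 schools, 5 children each, 4 years
toy <- expand.grid(year = 0:3, child = 1:5, schoolid = 1:4)
toy$childid <- toy$schoolid * 100 + toy$child

# Random intercepts for schools and for children within schools
school_eff <- rnorm(4, sd = 2)
child_ids  <- unique(toy$childid)
child_eff  <- rnorm(length(child_ids), sd = 1)

toy$math <- 10 + 1.5 * toy$year +
  school_eff[toy$schoolid] +
  child_eff[match(toy$childid, child_ids)] +
  rnorm(nrow(toy), sd = 0.5)

fit <- lmer(math ~ year + (1 | schoolid / childid), data = toy)

# Child-level lines: condition on all random effects (the default)
toy$fit_child <- predict(fit)
# School-level lines: condition on the school intercepts only
toy$fit_school <- predict(fit, re.form = ~ (1 | schoolid))

p <- ggplot(toy, aes(x = year, y = math, group = factor(childid))) +
  geom_point() +
  facet_wrap(~ schoolid) +
  geom_line(aes(y = fit_child), color = "grey50") +
  geom_line(aes(y = fit_school, group = factor(schoolid)),
            color = 2, linewidth = 1.5)
p
```

Because the fitted values live in the same data frame as the observations, no joins or ID parsing are needed. The trade-off is that the lines are drawn through predictions rather than from the coefficient tables int2 and int3.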
