Connecting mean points of a line plot in ggplot2

I have a sample dataset with four columns: PATIENTID (patient ID), VISITNUMBER (the number of that visit to the hospital), TIME (time in years since the first visit), and HEALTH (health status). I am trying to plot HEALTH over time.

This is my code in R:

library(ggplot2)

# data structure
PATIENTID <- c(126, 126, 126, 255, 255, 389, 389, 389, 389, 389, 470, 470, 470)
VISITNUMBER <- c(1, 2, 3, 1, 2, 1, 2, 3, 4, 5, 1, 2, 3)
TIME <- c(0, 4, 6, 0, 3, 0, 1, 2, 3, 4, 0, 1, 2)
HEALTH <- c(0.333, 0.452, 0.468, 0.571, 0.522, 0.444, 0.452, 0.431, 0.510, 0.532, 0.214, 0.333, 0.400)

mydata <- data.frame(PATIENTID, VISITNUMBER, TIME, HEALTH)


# converting patient ID and visit number to factor 

mydata$PATIENTID   <- factor(mydata$PATIENTID)
mydata$VISITNUMBER <- factor(mydata$VISITNUMBER)

# creating a spaghetti plot of health over time

sp_HEALTH <- ggplot(data = mydata, aes(TIME, HEALTH, group=PATIENTID))
sp_HEALTH + 
  geom_line() + 
  stat_smooth(aes(group=1), method = "lm", se = FALSE) + 
  stat_summary(aes(group=1), geom = "point", fun = mean, 
               shape = 17, size = 3, col = "red")

This is the plot generated by this code:

[Image: spaghetti plot]

I am trying to connect the mean points (the red triangles in the plot above) with a blue line that runs from point to point, but instead I get a straight regression line. I want the means joined the way a regular line plot connects its points (see the sample below). How do I add a line that connects the mean points?

[Image: sample line plot]

Thank you!

It may be easier to compute the mean at each TIME with dplyr::mutate, then add separate geoms for the per-patient values and the mean values:

library(dplyr)
library(ggplot2)

mydata %>% 
  mutate(PATIENTID = factor(PATIENTID)) %>% 
  group_by(TIME) %>% 
  mutate(MEAN = mean(HEALTH)) %>% 
  ungroup() %>% 
  ggplot() + 
  geom_line(aes(TIME, HEALTH, group = PATIENTID)) + 
  geom_line(aes(TIME, MEAN), color = "blue") + 
  geom_point(aes(TIME, MEAN), color = "red", size = 3, shape = 17)
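
A possible variation on the same idea, offered as a sketch rather than as part of the original answer: summarise the means into a small separate data frame (one row per TIME) and hand it to the mean layers through their data argument. The name means is made up here for illustration, and the .groups argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)
library(ggplot2)

# Sample data from the question (VISITNUMBER is not needed for the plot)
mydata <- data.frame(
  PATIENTID = factor(c(126, 126, 126, 255, 255, 389, 389, 389, 389, 389, 470, 470, 470)),
  TIME      = c(0, 4, 6, 0, 3, 0, 1, 2, 3, 4, 0, 1, 2),
  HEALTH    = c(0.333, 0.452, 0.468, 0.571, 0.522, 0.444, 0.452,
                0.431, 0.510, 0.532, 0.214, 0.333, 0.400)
)

# One row per distinct TIME value, holding the mean HEALTH at that time
# ("means" is a hypothetical name chosen for this sketch)
means <- mydata %>%
  group_by(TIME) %>%
  summarise(MEAN = mean(HEALTH), .groups = "drop")

# Per-patient lines come from mydata; the mean line and points come from means
p <- ggplot() +
  geom_line(data = mydata, aes(TIME, HEALTH, group = PATIENTID)) +
  geom_line(data = means, aes(TIME, MEAN), color = "blue") +
  geom_point(data = means, aes(TIME, MEAN), color = "red", size = 3, shape = 17)

print(p)
```

Compared with mutate, this draws each mean point once instead of once per patient row, which matters mainly if transparency or point counts are relevant.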

Or you could simply add a second stat_summary with geom = "line". Note that in each case aes() is set inside the individual layers rather than in ggplot(), so no layer inherits a grouping it should not have.

mydata %>% 
  ggplot() +
  geom_line(aes(TIME, HEALTH, group=PATIENTID)) + 
  stat_summary(aes(TIME, HEALTH), geom = "point", fun = mean, shape = 17, size = 3, col = "red") + 
  stat_summary(aes(TIME, HEALTH), geom = "line",  fun = mean, col = "blue")
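
For context, stat_smooth(method = "lm") fits a single linear regression through all the points, which is why the original line was straight, whereas stat_summary(fun = mean) computes one mean per distinct TIME value. To check which values the blue line passes through, those means can be computed directly. This is a minimal base-R sketch using the question's sample data; the name mean_by_time is introduced here for illustration only.

```r
# Sample data from the question
TIME   <- c(0, 4, 6, 0, 3, 0, 1, 2, 3, 4, 0, 1, 2)
HEALTH <- c(0.333, 0.452, 0.468, 0.571, 0.522, 0.444, 0.452,
            0.431, 0.510, 0.532, 0.214, 0.333, 0.400)
mydata <- data.frame(TIME, HEALTH)

# Mean HEALTH at each distinct TIME value; one row per TIME.
# "mean_by_time" is a hypothetical name chosen for this sketch.
mean_by_time <- aggregate(HEALTH ~ TIME, data = mydata, FUN = mean)

print(mean_by_time)
```

Some TIME values (for example TIME = 6) are observed for only one patient, so the mean line there simply follows that patient's value.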

