ggplot2 - align overlaid points in center of boxplot, and connect the points with lines

I am working on a boxplot with points overlaid and lines connecting the points between two time points. Example data is provided below.

I have two questions:

  1. I would like the points to look like this, with just a little height jitter and more width jitter. However, I want the points to be symmetrically centered around the middle of the boxplot at each y-axis value (to make the plots more visually pleasing). For example, I would like the 6 data points at y = 4 and x = "after" to be placed 3 to the right of the boxplot center and 3 to the left of the center, at symmetrical distances from the center.

  2. Also, I want the lines to connect with the correct points, but now the lines start and end in the wrong places. I know I can use position = position_dodge() in geom_point() and geom_line() to get the correct positions, but I also want to be able to adjust the points by height (why do the points and lines align with position_dodge() but not with position_jitter()?).

Are these two things possible to achieve?

Thank you!

library(ggplot2)

examiner <- rep(1:15, 2)
time <- rep(c("before", "after"), each = 15)
result <- c(1,3,2,3,2,1,2,4,3,2,3,2,1,3,3,3,4,4,5,3,4,3,2,2,3,4,3,4,4,3)
data <- data.frame(examiner, time, result)

ggplot(data, aes(time, result, fill=time)) + 
  geom_boxplot() +
  geom_point(aes(group = examiner), 
             position = position_jitter(width = 0.2, height = 0.03)) +
  geom_line(aes(group = examiner), 
            position = position_jitter(width = 0.2, height = 0.03), alpha = 0.3)

I'm not sure that you can satisfy both of your questions together.

  1. You can have a more "symmetric" jitter by using a geom_dotplot, as per:
ggplot(data, aes(time, result, fill=time)) + 
  geom_boxplot() +
  geom_dotplot(binaxis="y", aes(x=time, y=result, group = time), 
             stackdir = "center", binwidth = 0.075)

The problem is that when you add the lines, they will join at the original, un-jittered points.

  2. To join jittered points with lines that follow them, add the jitter to the data before plotting. As you saw, jittering both layers gives points and lines that don't match, because each layer draws its own random offsets. See Connecting grouped points with lines in ggplot for a better explanation.
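library(dplyr)
library(ggplot2)

# Jitter once and store the result, so points and lines share coordinates.
# On the discrete x axis "after" sits at 1 and "before" at 2 (alphabetical order).
data <- data %>% 
  mutate(result_jit = jitter(result, amount = 0.1),
         time_jit = jitter(case_when(
           time == "before" ~ 2,
           time == "after" ~ 1
         ), amount = 0.1)
  )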

ggplot(data, aes(time, result, fill=time)) + 
  geom_boxplot() +
  geom_point(aes(x=time_jit, y=result_jit, group = examiner)) +
  geom_line(aes(x=time_jit, y=result_jit, group = examiner), alpha=0.3)

Result
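
As an alternative sketch, not part of the approach above: the layers disagree because each position_jitter() call draws its own random offsets. Assuming a ggplot2 version in which position_jitter() accepts a seed argument, one position object can be shared by both layers so that they receive identical offsets. geom_path() is used here rather than geom_line(), since geom_line() sorts its rows by x before the position adjustment runs, which would pair the offsets with different rows. The object names pj and p are only illustrative.

```r
library(ggplot2)

# Same example data as in the question
data <- data.frame(
  examiner = rep(1:15, 2),
  time = rep(c("before", "after"), each = 15),
  result = c(1,3,2,3,2,1,2,4,3,2,3,2,1,3,3,3,4,4,5,3,4,3,2,2,3,4,3,4,4,3)
)

# One jitter specification with a fixed seed, reused by both layers
# (assumes position_jitter() supports the seed argument)
pj <- position_jitter(width = 0.2, height = 0.03, seed = 1)

p <- ggplot(data, aes(time, result, fill = time)) +
  geom_boxplot() +
  geom_point(aes(group = examiner), position = pj) +
  # geom_path() keeps the rows in data order, so the offsets line up with the points
  geom_path(aes(group = examiner), position = pj, alpha = 0.3)

p
```

This gives matching points and lines with jitter in both directions, but the offsets are still random, so the points are not symmetric around the box center.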

It is possible to extract the transformed points from the geom_dotplot layer using ggplot_build(); see "Is it possible to get the transformed plot data? (eg coordinates of points in dot plot, density curve)".

These points can be merged onto the original data, to be used as the anchor points for the geom_line.

Putting it all together:

library(dplyr)
library(ggplot2)

examiner <- rep(1:15, 2)
time <- rep(c("before", "after"), each = 15)
result <- c(1,3,2,3,2,1,2,4,3,2,3,2,1,3,3,3,4,4,5,3,4,3,2,2,3,4,3,4,4,3)

# Create a numeric version of time
data <- data.frame(examiner, time, result) %>% 
  mutate(group = case_when(
           time == "before" ~ 2,
           time == "after" ~ 1)
  )

# Build a ggplot of the dotplot to extract data
dotpoints <- ggplot(data, aes(time, result, fill=time)) + 
  geom_dotplot(binaxis="y", aes(x=time, y=result, group = time), 
               stackdir = "center", binwidth = 0.075)

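# Extract values of the dotplot (one row per dot).
# newx shifts each dot sideways by its stack position; the 1.2 * binwidth / 2
# scaling appears to be tuned by eye and may need adjusting for other plot sizes.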
dotpoints_dat <- ggplot_build(dotpoints)[["data"]][[1]] %>% 
  mutate(key = row_number(),
         x = as.numeric(x),
         newx = x + 1.2*stackpos*binwidth/2) %>% 
  select(key, x, y, newx)

# Join the extracted values to the original data by row position.
# This assumes the built data is ordered by group, then result.
data <- arrange(data, group, result) %>% 
  mutate(key = row_number())
newdata <- inner_join(data, dotpoints_dat, by = "key") %>% 
  select(-key)

# Create final plot
ggplot(newdata, aes(time, result, fill=time)) + 
  geom_boxplot() +
  geom_dotplot(binaxis="y", aes(x=time, y=result, group = time), 
               stackdir = "center", binwidth = 0.075) +
  geom_line(aes(x=newx, y=result, group = examiner), alpha=0.3)

Result
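
The join above pairs rows by position, so it only works if ggplot_build() returns the dots in the same order as arrange(data, group, result). Below is a sketch of a join that does not depend on row order. It assumes the built layer exposes the x, y, stackpos and binwidth columns used above, and that each dot's y equals the original result (which should hold here because the values are integers, far apart relative to binwidth). The names xpos, idx, dots and data_idx are made up for this illustration, and the newx formula is copied unchanged from the code above.

```r
library(dplyr)
library(ggplot2)

# Same example data as in the question
data <- data.frame(
  examiner = rep(1:15, 2),
  time = rep(c("before", "after"), each = 15),
  result = c(1,3,2,3,2,1,2,4,3,2,3,2,1,3,3,3,4,4,5,3,4,3,2,2,3,4,3,4,4,3)
)

dotpoints <- ggplot(data, aes(time, result)) +
  geom_dotplot(binaxis = "y", aes(group = time),
               stackdir = "center", binwidth = 0.075)

# One row per dot; idx numbers the dots inside each (x position, value) stack
dots <- ggplot_build(dotpoints)[["data"]][[1]] %>%
  transmute(xpos = as.numeric(x),
            result = y,
            newx = xpos + 1.2 * stackpos * binwidth / 2) %>%
  group_by(xpos, result) %>%
  mutate(idx = row_number()) %>%
  ungroup()

# Number the observations the same way; "after" is 1 and "before" is 2 on the x axis
data_idx <- data %>%
  mutate(xpos = as.numeric(factor(time))) %>%
  group_by(xpos, result) %>%
  mutate(idx = row_number()) %>%
  ungroup()

# Match each observation to one dot in its own stack, whatever the row order
newdata <- inner_join(data_idx, dots, by = c("xpos", "result", "idx"))

ggplot(newdata, aes(time, result, fill = time)) +
  geom_boxplot() +
  geom_dotplot(binaxis = "y", aes(group = time),
               stackdir = "center", binwidth = 0.075) +
  geom_line(aes(x = newx, group = examiner), alpha = 0.3)
```

Which examiner lands on which dot within a stack is arbitrary, so some lines may cross more than necessary; the line ends still fall on dot positions derived from the same stack.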
