
How to connect grouped points in ggplot within groups?

I have a dataset with two groups, Experimental and Control. Each participant contributes two responses, which represent different learning styles. These are shown in the box plots with jitter below. I would like to connect each participant's two responses with a line using ggplot (so each red point in the Control group would be joined to the corresponding turquoise point in the Control group), but I can't figure out how to do this within the conditions. Can someone please help? I am new to R and really need guidance.

Then, I need to change the color of the lines within the conditions to black if Increase = TRUE and red if Increase = FALSE.

Ideally, I need it to look like Jon's example here, but with black or red lines based on TRUE or FALSE: Connecting grouped points with lines in ggplot

The data and ggplot code look like this:

library(tidyverse) # provides %>%, gather() and ggplot()

d <- data.frame (
  Subject = c("1", "2", "3", "4"),
  Group  = c("Exp", "Exp", "Control", "Control"),
  Tr = c("14", "11", "4", "23"),
  Sr = c("56", "78", "12", "10"),
  Increase = c("TRUE", "TRUE", "TRUE", "FALSE")
)

# put the data in long format
d <- d %>%
  gather(key = "Strategy", value = "raw", Tr, Sr)

d %>%
  ggplot(aes(x = Group, y = raw, color = Strategy)) +
  geom_boxplot(width = 0.5, lwd = 0.5) +
  geom_jitter(width = 0.15) +
  geom_line(aes(group = raw),
            color = "grey",
            arrow = arrow(type = "closed",
                          length = unit(0.075, "inches"))) 

Inspired by the answer you linked to (@Jon's answer).

There are a few key things to understand about the solution:

  1. Since the points and lines need to be connected, both must receive exactly the same random jitter. The simplest way is to jitter the data before it goes into the plot, which is what I did.
  2. Since the variable to jitter is not numeric, note that ggplot treats the discrete variable Group as a factor whose levels sit at positions 1, 2, 3, ... on the x axis. Hence we create a numeric vector group_jit with values around 1 and 2, offset slightly left or right according to the colouring variable Strategy.
  3. Since you have two independent colour scales, it is best to represent the points (Strategy) as fill and the lines (Increase) as colour, to avoid a single legend with four entries.
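
The position arithmetic in point 2 can be sketched without any plotting. This is only an illustration in base R: the toy vectors mirror the long-format data, factor(x, levels = unique(x)) stands in for as_factor() on character input (levels in order of first appearance), and the random jitter() noise is left out so the numbers are easy to follow.

```r
# Toy vectors mirroring the long-format data (4 subjects x 2 strategies)
group    <- rep(c("Exp", "Exp", "Control", "Control"), times = 2)
strategy <- rep(c("Tr", "Sr"), each = 4)

# Levels in order of first appearance, as forcats::as_factor() does for characters
group_f    <- factor(group, levels = unique(group))       # Exp = 1, Control = 2
strategy_f <- factor(strategy, levels = unique(strategy)) # Tr = 1, Sr = 2

width_jitter <- 0.2

# Centre the strategy codes on zero (-0.5 / +0.5), then scale to +/- width_jitter
offset <- (as.numeric(strategy_f) - 1.5) * width_jitter * 2

# Final x position: the group's slot on the axis plus the strategy offset
x_pos <- as.numeric(group_f) + offset
print(data.frame(group, strategy, x_pos))
```

With these settings the Tr points sit 0.2 to the left of each group's position and the Sr points 0.2 to the right; the jitter() call in the full code adds a small random shift on top of that.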

Here's the code:

library(tidyverse)

# Load data
d <- data.frame (
  Subject = c("1", "2", "3", "4"),
  Group  = c("Exp", "Exp", "Control", "Control"),
  Tr = c("14", "11", "4", "23"),
  Sr = c("56", "78", "12", "10"),
  Increase = c("TRUE", "TRUE", "TRUE", "FALSE")
)

set.seed(1) # make the random jitter reproducible between runs

width_jitter <- 0.2 # 1 means full width between points

# put the data in long format
d_jit <- d %>%
  gather(key = "Strategy", value = "raw", Tr, Sr) %>% 
  
  # type conversions
  mutate(across(c(Group, Strategy), as_factor)) %>% # convert to factors
  mutate(raw = as.numeric(raw)) %>% # make raw as numbers
  
  # position on x axis is based on combination of Group and jittered Strategy. Mix to taste.
  mutate(group_jit = as.numeric(Group) + jitter(as.numeric(Strategy) - 1.5) * width_jitter * 2,
         grouping = interaction(Subject, Strategy))

# plotting
d_jit %>%
  ggplot(aes(x = Group, y = raw, fill = Strategy)) +
  geom_boxplot(width = 0.5, lwd = 0.5, alpha = 0.05, show.legend = FALSE) +
  geom_point(aes(x = group_jit), size = 3, shape = 21) +
  
  geom_line(aes(x = group_jit,
                group = Subject,
                colour = Increase),
            alpha = 0.5,
            arrow = arrow(type = "closed",
                          length = unit(0.075, "inches"))
            ) + 
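  # name the colours by Increase value so the mapping does not depend on level order
  scale_colour_manual(values = c("TRUE" = "black", "FALSE" = "red"))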

Created on 2022-05-14 by the reprex package (v2.0.1)
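
A note on the colour scale, as a small base-R check rather than anything plot-specific: because Increase is stored as text, its values sort as "FALSE" then "TRUE", and an unnamed values vector passed to scale_colour_manual() is matched to the levels in that order. Naming the colours by level removes the dependence on ordering. The object names below (line_colours and so on) are only illustrative.

```r
increase <- c("TRUE", "TRUE", "TRUE", "FALSE")

# Character values are ordered alphabetically when turned into a discrete scale
lvls <- levels(factor(increase))

# An unnamed vector is paired with the levels position by position
unnamed <- c("red", "black")
matched <- setNames(unnamed, lvls)

# A named vector states the intended pairing explicitly
line_colours <- c("TRUE" = "black", "FALSE" = "red")

print(matched)
print(line_colours[lvls])
```

Both pairings give red for FALSE and black for TRUE here; the named form can be passed as scale_colour_manual(values = line_colours) and keeps working if the level order changes.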

For completeness' sake, a different and more elegant way to do the jitter is to give a position argument to the geom_point and geom_line calls. This argument is a position object that adds the random jitter, like this (source: @erocoar's answer):

position = ggplot2::position_jitterdodge(dodge.width = 0.75, jitter.width = 0.3, seed = 1)

This way the data itself is not changed and the plotting takes care of the jittering details:

  • jitterdodge does the dodge (a shift along the x axis variable) and the jitter (small noise for the coloured points)
  • The seed argument here is key, since it ensures that the same random values are returned for the point and line layers, which call it independently

Not a direct answer to your question, but I wanted to suggest an alternative visualisation.

You are dealing with paired data. A much more convincing visualisation is achieved with a scatter plot. You use both dimensions of the page rather than mapping your two measurements onto a single axis. You can compare the control and experimental subjects more easily and see immediately who got better or worse.

library(tidyverse)

d <- data.frame (
  Subject = c("1", "2", "3", "4"),
  Group  = c("Exp", "Exp", "Control", "Control"),
  Tr = c("14", "11", "4", "23"),
  Sr = c("56", "78", "12", "10"),
  Increase = c("TRUE", "TRUE", "TRUE", "FALSE")
)  %>%
## convert to numeric first
mutate(across(c(Tr,Sr), as.integer))

## set coordinate limits
lims <- range(c(d$Tr, d$Sr))

ggplot(d) +
  geom_point(aes(Tr, Sr, color = Group)) +
## adding a line of equality and setting limits equal helps guide the eye
  geom_abline(intercept = 0, slope = 1, lty = "dashed") +
  coord_equal(xlim = lims , ylim = lims )
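
To read the plot: points above the dashed line of equality have Sr > Tr, and points below it have Sr < Tr. As a quick check in base R, assuming Increase is meant to record whether Sr exceeds Tr (the question does not say so explicitly), the comparison below reproduces the Increase column for this data. The values are converted to numbers first, because as text they would be compared character by character.

```r
d <- data.frame(
  Subject = c("1", "2", "3", "4"),
  Tr = c("14", "11", "4", "23"),
  Sr = c("56", "78", "12", "10"),
  Increase = c("TRUE", "TRUE", "TRUE", "FALSE")
)

# TRUE when the point lies above the line Sr = Tr
above_line <- as.numeric(d$Sr) > as.numeric(d$Tr)

# Compare with the supplied column (stored as text, so convert it to logical)
agrees <- above_line == as.logical(d$Increase)

print(data.frame(Subject = d$Subject, above_line, agrees))
```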
