tidyverse/ggplot2: Subsetting by a factor that's used in manual scales?

Question

I have a large and complex data set, but the important parts boil down to something like this:

library(tidyverse)  # provides dplyr, tidyr, purrr and ggplot2, all used below

my_df <- data.frame(Expt = rep(c("Expt1", "Expt2", "Expt3", "Expt4"), each = 96),
                  ExpType = rep(c("A", "B"), each = 192),
                  Treatment = c(rep("T1", 192), rep("T2", 144), rep("T1", 48)),
                  Subject = c(rep(c("S01", "S02", "S03", "S04", "S05", "S06", "S07", "S08"), 24), rep("S01", 96), rep("S06", 96)),
                  xvar = as.factor(rep(rep(c(10, 5, 2.5, 1.25, 0.6, 0.3, 0.16, 0.08, 0.04, 0.02, 0, "NA"), each = 8),  4)),
                  yvar = runif(384))

I'm currently grouping my data by ExpType and Treatment, calculating some summary statistics, and then plotting, like this:

myplots <- my_df %>%
  group_by(ExpType, Treatment) %>%  #  took out Include because I'm using the versions with no questionable data
  nest() %>%

  mutate(sumstats = map(
    .x = data,
    ~.x %>%
      group_by(Subject, xvar) %>%
      summarize(
        my_mean = mean(yvar, na.rm = TRUE)
      )))  %>%

  mutate(plots1 = map2(
    .x = data,
    .y = sumstats,
    ~ggplot(data = .x) +
      theme_classic() +
      scale_shape_manual(name = "Subject", values = c("S01" = 23, "S02" = 24, "S03" = 21, "S04" = 21, "S05" = 22, "S06" = 22, "S07" = 24, "S08" = 25)) + 
      scale_linetype_manual(name = "Subject", values = c("S01" = "solid", "S02" = "dotted", "S03" = "dotted", "S04" = "solid", "S05" = "dotted", "S06" = "dashed", "S07" = "solid", "S08" = "dashed")) +
      scale_fill_manual(name = "Subject", values = c("S01" = "#AA4499", "S02" = "#882255", "S03" = "#CC6677", "S04" = "#DDCC77", "S05" = "#999933", "S06" = "#117733", "S07" = "#44AA99", "S08" = "#88CCEE")) +
      scale_color_manual(name = "Subject", values = c("S01" = "#AA4499", "S02" = "#882255", "S03" = "#CC6677", "S04" = "#DDCC77", "S05" = "#999933", "S06" = "#117733", "S07" = "#44AA99", "S08" = "#88CCEE")) +
      geom_line(data = .y, aes(x=xvar, y = my_mean, group=Subject,  color=Subject, linetype = Subject)) +
      geom_point(aes(x=xvar, y = yvar, group=Subject, fill=Subject, shape = Subject), size = 2.5)

  ))

walk(.x = myplots$plots1,  ~print(.x))

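This works well, but I have so many subjects that it's hard to see what's going on, so I'd like to make a separate graph for each subject. I could facet by subject, but there are so many of them that each panel ends up tiny and hard to read.

Where and how do I add this extra subsetting step while still passing the factor levels to the manual scales?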

Answer 1

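Solution 1

A simple way to access the value of Subject from within each nested dataset, while also using it as a grouping variable outside the nesting, is to create a duplicate column with the same values: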

# define manual scale palettes outside for easy reusability
shape.pal <- c(23, 24, 21, 21, 22, 22, 24, 25)
linetype.pal <- c("solid", "dotted", "dotted", "solid", "dotted", "dashed", "solid", "dashed")
fill.pal <- c("#AA4499", "#882255", "#CC6677", "#DDCC77", "#999933", "#117733", "#44AA99", "#88CCEE")
names(shape.pal) <- names(linetype.pal) <- names(fill.pal) <- paste0("S0", seq(1, 8))

myplots <- my_df %>%
  mutate(Subject2 = Subject) %>% # add a duplicate column for subject
  group_by(ExpType, Treatment, Subject2) %>% # group by the duplicate so the original Subject column stays inside the nested data
  nest() %>%

  mutate(sumstats = map(
    .x = data,
    ~.x %>%
      group_by(Subject, xvar) %>%
      summarize(
        my_mean = mean(yvar, na.rm = TRUE)
      )))  %>%

  mutate(plots1 = map2(
    .x = data,
    .y = sumstats,
    ~ggplot(data = .x, 
            aes(x = xvar, y = yvar, group = Subject)) +
      geom_line(data = .y, 
                aes(y = my_mean, color = Subject, linetype = Subject)) +
      geom_point(aes(fill = Subject, shape = Subject), 
                 size = 2.5) +
      labs(shape = "Subject", linetype = "Subject", fill = "Subject", colour = "Subject") +
      scale_shape_manual(values = shape.pal) + 
      scale_linetype_manual(values = linetype.pal) +
      scale_fill_manual(values = fill.pal, aesthetics = c("fill", "color")) +
      theme_classic()
  ))

walk(.x = myplots$plots1,  ~print(.x))

[Figure: resulting plot for ExpType = A, Treatment = T1, Subject = S01]
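As a quick check of the duplicate-column trick, here is a minimal sketch on a toy data frame. The `toy` data and its values are made up for illustration and are not taken from the question. It suggests that grouping by `Subject2` leaves the original `Subject` column inside each nested tibble, which is what lets the `aes()` calls keep referring to `Subject`.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: 2 treatments x 2 subjects, 2 rows each
toy <- data.frame(
  Treatment = rep(c("T1", "T2"), each = 4),
  Subject   = rep(c("S01", "S02"), times = 4),
  yvar      = (1:8) / 10
)

nested <- toy %>%
  mutate(Subject2 = Subject) %>%      # duplicate the column
  group_by(Treatment, Subject2) %>%   # group by the duplicate only
  nest()

# Columns outside the nesting: the grouping variables plus the list-column
outer_cols <- names(nested)

# Columns inside the first nested tibble: Subject is still available here
inner_cols <- names(nested$data[[1]])

# Each nested tibble holds exactly one subject
subjects_in_first <- unique(nested$data[[1]]$Subject)

print(outer_cols)
print(inner_cols)
print(subjects_in_first)
```

Because each nested tibble contains a single subject, every plot built from it shows one subject only, while the column name used in the aesthetics is unchanged.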

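Solution 2

More generally, if you use pmap rather than map2 when creating the ggplot objects, you can access the grouping variables directly, because pmap lets you map over more than two inputs at once.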
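To show the idea in isolation, the sketch below maps over three inputs at once with `pmap_chr()`, the character-returning variant of `pmap()`. The vectors are made-up stand-ins for the grouping columns, and the argument names `s`, `z`, and `w` are arbitrary; the names in the list only need to match the function's arguments.

```r
library(purrr)

# Hypothetical stand-ins for the Subject, ExpType and Treatment columns
subjects   <- c("S01", "S02", "S06")
exp_types  <- c("A", "A", "B")
treatments <- c("T1", "T1", "T2")

# pmap() walks the list elements in parallel and matches them to the
# function arguments by name; pmap_chr() returns a character vector
titles <- pmap_chr(
  .l = list(s = subjects, z = exp_types, w = treatments),
  .f = function(s, z, w) {
    paste0("ExpType: ", z, ", Treatment: ", w, ", Subject: ", s)
  }
)

print(titles)
```

The same pattern scales to any number of parallel inputs, which is why it suits a nested tibble with several grouping columns next to the list-columns.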

The demonstration below also maps ExpType and Treatment, so that each plot's title reflects its combination of grouping-variable values:

myplots <- my_df %>%
  group_by(ExpType, Treatment, Subject) %>% # use subject as a grouping variable
  group_nest() %>%

  mutate(sumstats = map(
    .x = data,
    ~.x %>%
      group_by(xvar) %>% # not grouping by subject here, as it's not a column in data
      summarize(
        my_mean = mean(yvar, na.rm = TRUE)
      )))  %>%

  mutate(plots1 = pmap(
    .l = list(x = data,
              y = sumstats,
              s = as.character(Subject),
              z = as.character(ExpType),
              w = as.character(Treatment)),
    function(x, y, s, z, w) ggplot(data = x, aes(x = xvar, group = s)) +
      geom_line(data = y, 
                aes(y = my_mean, color = s, linetype = s)) +
      geom_point(aes(y = yvar, fill = s, shape = s), 
                 size = 2.5) +
      labs(title = paste0("ExpType: ", z, ", Treatment: ", w),
           shape = "Subject", linetype = "Subject", fill = "Subject", colour = "Subject") +
      scale_shape_manual(values = shape.pal) + 
      scale_linetype_manual(values = linetype.pal) +
      scale_fill_manual(values = fill.pal, aesthetics = c("fill", "colour")) +
      theme_classic()     
  ))

walk(.x = myplots$plots1,  ~print(.x))

[Figure: resulting plot for ExpType = A, Treatment = T1, Subject = S01]
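Both solutions rely on the palettes being named vectors: ggplot2's manual scales match named `values` to the data values by name, so a plot that contains only one subject should still pick up that subject's own colour rather than the first entry in the palette. The sketch below checks this for a single subject. The `one_subject` data frame is hypothetical, and `layer_data()` is used only to inspect the colours ggplot2 computed.

```r
library(ggplot2)

# Same fill palette as in the answer, named by subject
fill.pal <- c("#AA4499", "#882255", "#CC6677", "#DDCC77",
              "#999933", "#117733", "#44AA99", "#88CCEE")
names(fill.pal) <- paste0("S0", seq(1, 8))

# Hypothetical data containing only subject S06
one_subject <- data.frame(
  Subject = "S06",
  xvar    = factor(c("0.02", "0.04", "0.08")),
  yvar    = c(0.2, 0.5, 0.8)
)

p <- ggplot(one_subject, aes(x = xvar, y = yvar, fill = Subject)) +
  geom_point(shape = 22, size = 2.5) +
  scale_fill_manual(values = fill.pal)

# Fill colour actually assigned to the points in the first layer
used_fill <- unique(layer_data(p, 1)$fill)
print(used_fill)

# Colour the palette assigns to S06
print(unname(fill.pal["S06"]))
```

If the two printed values agree, the subject keeps the same colour no matter which subset of subjects ends up in a given plot.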

Solution 1
1, accepted 2019-09-11 04:18:06