R ggplot: How to define group-dependent y-axis breaks using faceted ggplots?

I have 40 groups (defined by short_ID) and would like to produce 40 different plots that use different y-scale breaks for each short_ID. I want the breaks for the y-scale to be (1) mean-2SD, (2) mean and (3) mean+2SD.

I have a dataset called Dataplots containing my X and Y variables and the grouping variable "short_ID". I have created additional vectors M$SD11 (=mean-2SD), M$mean and M$SD22 (=mean+2SD) to define the breaks and M$short_ID as grouping variable. The code below partly works but the problem is that I do not know how to make the breaks group-dependent (i.e., dependent on short_ID). When I run the code below I get the same y-axis breaks for all plots, namely for example the max of the vector M$SD22 instead of a different M$SD22 value for each plot. So I think I need to add something to "scale_y_continuous(breaks=c(M$SD11, M$mean, M$SD22))", for example "scale_y_continuous(group=M$short_ID, breaks=c(M$SD11, M$mean, M$SD22))", but this does not work.

Does anybody know what I can do to define different breaks for my different groups (i.e., short_IDs)? How can I change the code below to do this? Many thanks!

Dataplot <- ggplot(data = Dataplots, aes(x = Measure, y = Amylase_u, group = short_ID)) +
  geom_line() +
  facet_wrap(~ short_ID) +
  scale_y_continuous(breaks = c(M$SD11, M$mean, M$SD22))

I have added an example of 'Dataplots' and 'M'. For the purpose of the example I included only two groups (i.e., short_IDs) instead of the 40 I actually have. Thus this example would need to produce 2 plots, one for each short_ID, with different y-axis breaks for each of the groups.

Example of Dataplots:

dput(Dataplots)
structure(list(short_ID = c(1111, 1111, 1111, 1111, 2222, 2222, 2222, 2222),
               Measure = c(1, 2, 3, 4, 1, 2, 3, 4),
               Amylase_u = c(81.561, 75.648, 145.25, 85.246, 311.69, 261.74, 600.93, 291.39)),
          .Names = c("short_ID", "Measure", "Amylase_u"),
          row.names = c(NA, -8L), class = "data.frame", codepage = 65001L)

Example of M:

dput(M)
structure(list(SD11 = c(162, 682),
               mean = c(97, 366),
               SD22 = c(32, 51),
               short_ID = c(1111, 2222)),
          .Names = c("SD11", "mean", "SD22", "short_ID"),
          row.names = 1:2, class = "data.frame")

@Mark I have been trying to apply your suggestions to my complete dataset but cannot seem to get it right. I have 61 plots in total. I started with:

myPlots <-
lapply(unique(Dataplots$short_ID), function(thisID){
Dataplots %>%
  filter(short_ID == thisID) %>%
  ggplot(aes(x = Measure, y = Amylase_u)) +
  geom_line() +
  scale_y_continuous(breaks= M %>%
                       filter(short_ID == thisID) %>%
                       select(mean) %>%
                       as.numeric()
  ) +
  ggtitle(thisID)
 })

(As you can see I decided to go for the subject-mean on the y-axis only and decided to drop the SDs.) I then continued with your final cowplot suggestion:

plot_grid(ggdraw() + draw_label("Amylase_u", angle = 90), plot_grid(
plot_grid(plotlist = lapply(myPlots, function(x){x + theme(axis.title = element_blank())}))
, ggdraw() + draw_label("Measurement")
, ncol = 1
, rel_heights = c(0.9, .1))
, nrow = 1, rel_widths =  c(0.05, 0.95))

This, however, results in 61 plots with the subject-mean on the y-axis but without the measurements depicted in them (so the graph itself is missing). I figured there may be a ')' misplaced so I tried:

plot_grid(
ggdraw() + draw_label("Amylase_u", angle = 90)
, plot_grid(
plot_grid(plotlist = lapply(myPlots, function(x){x +theme(axis.title = element_blank())}))
, ggdraw() + draw_label("Measurement")
, ncol = 1
, rel_heights = c(0.9, .1)
, nrow = 1
, rel_widths =  c(0.05, 0.95)))

This does give me graphs but they are tiny and the layout is terrible (Rplot2). I tried adapting rel_heights and rel_widths too, but even after reading the help file I don't quite get how I should adapt them.

Thanks again!

Rplot2

Finally, I removed the ID numbers on top of each plot because they are not really necessary and this already greatly improves the plot (Rplot3), but the layout still needs to be adjusted.

Rplot3

My understanding is that this still remains impossible in the facet functions. However, you can accomplish it yourself using the cowplot package.

First, loop over your IDs (in lapply) and generate each of the sub-plots you wanted. Note that I am using dplyr for the pipe and filtering.

library(dplyr)    # %>%, filter(), select()
library(ggplot2)

# One plot per short_ID, each with its own y-axis breaks taken from M
myPlots <-
  lapply(unique(Dataplots$short_ID), function(thisID){
    Dataplots %>%
      filter(short_ID == thisID) %>%
      ggplot(aes(x = Measure, y = Amylase_u)) +
      geom_line() +
      scale_y_continuous(breaks= M %>%
                           filter(short_ID == thisID) %>%
                           select(SD11, mean, SD22) %>%
                           as.numeric()
                         ) +
      ggtitle(thisID)
  })

Then, call the function plot_grid from cowplot with the list of plots:

plot_grid(plotlist = myPlots)

gives:

[figure: one plot per short_ID, arranged in a grid]

A few notes:

  • cowplot autoloads its own default style (this applies to cowplot versions before 1.0; later versions leave the ggplot2 default theme unchanged), so use theme_set to return to your preferred style
  • Your included data appear to not actually span all of the thresholds you gave for the y-axis breaks
  • This should work for an arbitrarily large number of subplots, though you may want/need to adjust labels and alignment to make them readable.
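The first two notes can be illustrated with a small sketch. It assumes the breaks are meant to come from the plotted data itself (mean and mean ± 2SD of Amylase_u per short_ID). The name M_fromData and the helper columns grp_mean and grp_sd are made up for this example and are not part of the original code.

```r
library(dplyr)
library(ggplot2)

# Return to the ggplot2 default look if cowplot changed the theme on load
theme_set(theme_gray())

# Sample data from the question
Dataplots <- data.frame(
  short_ID  = c(1111, 1111, 1111, 1111, 2222, 2222, 2222, 2222),
  Measure   = c(1, 2, 3, 4, 1, 2, 3, 4),
  Amylase_u = c(81.561, 75.648, 145.25, 85.246, 311.69, 261.74, 600.93, 291.39)
)

# Assumption: breaks are computed per group from the data being plotted.
# Column names follow the question's description:
# SD11 = mean - 2SD, SD22 = mean + 2SD.
M_fromData <- Dataplots %>%
  group_by(short_ID) %>%
  summarise(grp_mean = mean(Amylase_u), grp_sd = sd(Amylase_u)) %>%
  mutate(SD11 = grp_mean - 2 * grp_sd,
         mean = grp_mean,
         SD22 = grp_mean + 2 * grp_sd) %>%
  select(SD11, mean, SD22, short_ID)

round(as.data.frame(M_fromData))
```

Rounded, this gives 32 / 97 / 162 for short_ID 1111 and 51 / 366 / 682 for short_ID 2222. Those are the same numbers as in the posted M, except that there SD11 holds the upper value and SD22 the lower one. The upper values (162 and 682) lie above the largest observation in each group, which is why not every break shows up on the axis.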

Since I am not sure what your goal is, here is another alternative. If you just want to plot deviation from the mean (in standard deviations) to make the changes comparable, you could just calculate the z-score of the column within the groups and plot the results. Using dplyr again:

Dataplots %>%
  group_by(short_ID) %>%
  mutate(scaledAmylase = as.numeric(scale(Amylase_u)) ) %>%
  ggplot(aes(x = Measure
             , y = scaledAmylase)) +
  geom_line() +
  facet_wrap(~short_ID)

gives

[figure: faceted line plots of scaledAmylase, one panel per short_ID]
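To make explicit what scale() does inside each group, here is a minimal base R check using the four Amylase_u values of short_ID 1111 from the question. The variable names are only for illustration.

```r
# Amylase_u values for short_ID 1111 (from the question's sample data)
x <- c(81.561, 75.648, 145.25, 85.246)

# scale() with its defaults centers on the mean and divides by the sample SD
z_scale  <- as.numeric(scale(x))

# The same z-score written out by hand
z_manual <- (x - mean(x)) / sd(x)

all.equal(z_scale, z_manual)
```

On this standardized axis the three breaks asked for in the question (mean-2SD, mean, mean+2SD) sit at -2, 0 and 2 in every panel, so a single scale_y_continuous(breaks = c(-2, 0, 2)) could be added to the faceted plot. Breaks outside a panel's data range are not drawn unless the limits are widened.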

Or, if the mean/SD are calculated/defined somewhere else (and stored in M) rather than coming directly from the data, you can scale using M instead of the data:

Dataplots %>%
  left_join(M, by = "short_ID") %>%
  mutate(scaledAmylase = (Amylase_u - mean) / ((SD22 - mean) / 2) ) %>%
  ggplot(aes(x = Measure
             , y = scaledAmylase)) +
  geom_line() +
  facet_wrap(~short_ID)

gives

[figure: faceted line plots of scaledAmylase scaled using M, one panel per short_ID]
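One thing worth checking before relying on this scaling: the expression (SD22 - mean) / 2 recovers the SD only if SD22 really is mean+2SD. In the sample M posted in the question, SD22 is smaller than mean, so the recovered "SD" comes out negative and the scaled curve is flipped. The sketch below shows the arithmetic. Using abs() is my own suggestion for making the calculation independent of which column holds the upper bound, and sd_signed and sd_abs are illustrative names.

```r
# Sample M from the question
M <- data.frame(SD11 = c(162, 682),
                mean = c(97, 366),
                SD22 = c(32, 51),
                short_ID = c(1111, 2222))

# SD as recovered by the scaling code: negative for this sample M
sd_signed <- (M$SD22 - M$mean) / 2
sd_signed   # -32.5 -157.5

# Assumption: taking the absolute distance avoids the sign flip
sd_abs <- abs(M$SD22 - M$mean) / 2
sd_abs      # 32.5 157.5
```

If your real M has SD22 above the mean, as described in the text of the question, the original expression is fine as it stands.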

And, because I can't leave well enough alone, here is a version of the plot_grid approach that removes the duplicated axis titles and includes them just once instead (like facet_wrap would). As above, increasing the number of subplots or changing the aspect ratio will force you to tweak the relative values here:

plot_grid(
  ggdraw() + draw_label("Amylase_u", angle = 90)
  , plot_grid(
    plot_grid(plotlist = lapply(myPlots, function(x){x + theme(axis.title = element_blank())}))
    , ggdraw() + draw_label("Measurement")
    , ncol = 1
    , rel_heights = c(0.9, .1))
  , nrow = 1
  , rel_widths =  c(0.05, 0.95)
 )

gives

[figure: grid of plots with a single shared title per axis]
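For a large number of subplots (the follow-up mentions 61), tiny panels are often a matter of grid shape and output size rather than of rel_heights and rel_widths. Those two arguments only set how much room the shared axis labels get relative to the grid of plots. A rough sketch follows, using made-up stand-in plots in place of the real myPlots. The column count of 8, the file name and the 16 x 16 inch size are arbitrary assumptions to adjust to taste.

```r
library(ggplot2)
library(cowplot)

# Hypothetical stand-in for the real list of 61 per-ID plots
myPlots <- lapply(1:61, function(i) {
  ggplot(data.frame(Measure = 1:4, Amylase_u = c(1, 3, 2, 4) * i),
         aes(x = Measure, y = Amylase_u)) +
    geom_line() +
    theme(axis.title = element_blank())
})

# Fix the number of columns explicitly instead of letting plot_grid choose
grid61 <- plot_grid(plotlist = myPlots, ncol = 8)

# Saving to a large page gives each panel more room than a small plot window
# ggsave("amylase_grid.pdf", grid61, width = 16, height = 16)
```

The grid61 object can take the place of the inner plot_grid(plotlist = ...) call in the nested version above, with the two label strips added around it in the same way.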
