
Make a separate output containing 4 boxplots for each column in a dataframe

I have a dataframe, df, with 15 different data columns. There are 1162 rows, and each row belongs to one of four groups. Here is a snippet of df:

   var1 var2 var3      var4  group
1   2.4   68  310 0.6192056 group1
2   2.7   66  305 0.7287633 group1
3   7.1   73  297 0.5105544 group1
4   2.4   69  295 0.6198897 group1
5   3.4   74  283 0.6399971 group1
6   3.7   73  310 0.7949742 group1
7   3.0   65  281 0.7783598 group1
8   4.6   76  304 0.6377989 group1
9   3.0   64  311 0.7020734 group1
10  3.0   62  307 0.7437677 group1
11  2.5   72  306 0.6953473 group1
12  2.5   74  309 0.5672044 group1
13  2.7   34  285 0.7941913 group1
14  3.2   74  335 0.6661719 group2
15  2.5   72  305 0.5500999 group2
16  2.5   69  336 0.8476085 group2
17  2.8   63  289 0.6646900 group2
18  1.3   52  270 0.6448988 group2
19  2.9   70  334 0.7123679 group2
20  2.5   69  308 0.5418768 group2
21  2.4   73  307 0.5108490 group2
22  2.3   45  290 0.8393499 group2
23  2.3   66  283 0.5413923 group2
24  2.5   31  285 0.7542300 group2
25  2.6   68  332 0.7872316 group2
26  3.8   49  292 0.7775340 group2
27  2.4   66  294 0.6445523 group3
28  2.5   76  314 0.7265084 group3
29  7.5   80  314 0.6255964 group3
30  2.3   70  303 0.4487150 group3
31  2.2   53  426 0.8706240 group3
32  2.5   42  295 0.2243240 group3
33  2.4   66  320 0.5563342 group3
34  2.9   59  289 0.6899643 group3
35  2.5   24  280 0.7351417 group3
36  2.5   59  281 0.6295490 group3
37  2.9   68  305 0.6660455 group3
38  2.6   64  267 0.5632927 group3
39  2.6   58  283 0.6810814 group3
40  2.4   71  290 0.6878466 group4
41  2.5   59  302 0.6488055 group4
42  2.5   69  306 0.6815277 group4
43  2.6   56  297 0.5262509 group4
44  2.9   65  302 0.6239796 group4
45  2.5   63  302 0.7206896 group4
46  2.4   66  306 0.5208803 group4
47  2.8   59  293 0.7122809 group4
48  2.6   72  306 0.7217113 group4
49  2.7   56  293 0.7916376 group4
50  2.5   52  292 0.7499101 group4
51  2.7   76  313 0.6795473 group4
52  3.2   80  299 0.2475061 group4

This is the code I'm using to generate a boxplot per group for just one of the columns (var1 in this example):

df %>% ggplot( aes(x=group, y=var1 )) +
  geom_boxplot(outlier.size = 0) +
  geom_jitter() +
  ggtitle("Var1") +
  xlab("Group") +
  ylab("Var1") +
  theme(plot.title = element_text(hjust=0.5), axis.text.x=element_text(angle=45, hjust=1))

I want to create a separate output (either a separate PDF or just a separate page in a PDF) for each column in the dataframe. Each output will contain a boxplot for each of the 4 groups. I know that I could copy and paste the above code and replace y=var1 with the other column names, or I could use a for loop, but is there an easier/more efficient way to do this?

We can reshape to 'long' format and plot everything at once on a single page with facet_wrap:

library(dplyr)
library(tidyr)
library(ggplot2)
df %>%
  pivot_longer(cols = starts_with('var'), names_to = 'var_grp') %>% 
  ggplot(aes(x = group, y = value, fill = var_grp)) + 
  geom_boxplot(outlier.size = 0) +
  geom_jitter() +
  theme(plot.title = element_text(hjust=0.5), 
    axis.text.x=element_text(angle=45, hjust=1)) +
  facet_wrap(~ var_grp, scales = "free_y")  # give each variable its own y-axis range
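
To make the reshaping step concrete, here is a minimal sketch of what pivot_longer produces before the data reaches ggplot. The data frame `toy` is a hypothetical stand-in for df with only two measurement columns; the column name `var_grp` follows the answer's code, and `value` is pivot_longer's default name for the stacked values.

```r
library(tidyr)

# Hypothetical toy data standing in for df: two measurement columns plus group
toy <- data.frame(
  var1  = c(2.4, 2.7, 3.2, 2.5),
  var2  = c(68, 66, 74, 72),
  group = c("group1", "group1", "group2", "group2")
)

# Stack the var* columns: one row per (original row, variable) pair
toy_long <- pivot_longer(toy, cols = starts_with("var"), names_to = "var_grp")

dim(toy_long)    # 8 rows (4 rows x 2 variables) and 3 columns
names(toy_long)  # "group" "var_grp" "value"
```

Because every measurement now lives in the single `value` column, one ggplot call can draw all variables, and facet_wrap(~ var_grp) splits them into panels.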

Or, if we want to save each plot as a separate 'pdf' file:

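# Loop over every measurement column; this assumes 'group' is the only
# column that should not be plotted
for (nm in setdiff(names(df), "group")) {
  # .data[[nm]] looks up the column whose name is stored in nm
  p <- ggplot(df, aes(x = group, y = .data[[nm]])) +
    geom_boxplot(outlier.size = 0) +
    geom_jitter() +
    ggtitle(nm) +
    xlab("Group") +
    ylab(nm) +
    theme(plot.title = element_text(hjust = 0.5),
          axis.text.x = element_text(angle = 45, hjust = 1))
  # Write one PDF per column into the working directory
  ggsave(file.path(getwd(), paste0(nm, ".pdf")), p)
}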
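
The question also mentions the option of one page per column inside a single PDF. A possible sketch of that variant, not part of the original answer, is to build the plots as a list and print them to one pdf() device. The data frame `toy`, the file name `boxplots_by_column.pdf`, and the use of tempdir() are assumptions made for illustration; `outlier.shape = NA` is used here so that outliers are not drawn twice alongside the jittered points.

```r
library(ggplot2)

# Hypothetical toy data standing in for df
toy <- data.frame(
  var1  = c(2.4, 2.7, 7.1, 3.2, 2.5, 2.5),
  var2  = c(68, 66, 73, 74, 72, 69),
  group = rep(c("group1", "group2"), each = 3)
)

# Every column except the grouping column gets its own plot
value_cols <- setdiff(names(toy), "group")

plots <- lapply(value_cols, function(nm) {
  ggplot(toy, aes(x = group, y = .data[[nm]])) +
    geom_boxplot(outlier.shape = NA) +  # hide outliers; jitter shows all points
    geom_jitter(width = 0.2) +
    labs(title = nm, x = "Group", y = nm) +
    theme(plot.title = element_text(hjust = 0.5),
          axis.text.x = element_text(angle = 45, hjust = 1))
})

# Hypothetical output location; replace with any writable path
out_file <- file.path(tempdir(), "boxplots_by_column.pdf")

pdf(out_file, width = 7, height = 5)
for (p in plots) print(p)  # each print() call starts a new page
dev.off()
```

Keeping the plots in a list also makes it easy to inspect a single one interactively, for example with plots[[1]], before writing the file.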


data

df <- structure(list(var1 = c(2.4, 2.7, 7.1, 2.4, 3.4, 3.7, 3, 4.6, 
3, 3, 2.5, 2.5, 2.7, 3.2, 2.5, 2.5, 2.8, 1.3, 2.9, 2.5, 2.4, 
2.3, 2.3, 2.5, 2.6, 3.8, 2.4, 2.5, 7.5, 2.3, 2.2, 2.5, 2.4, 2.9, 
2.5, 2.5, 2.9, 2.6, 2.6, 2.4, 2.5, 2.5, 2.6, 2.9, 2.5, 2.4, 2.8, 
2.6, 2.7, 2.5, 2.7, 3.2), var2 = c(68L, 66L, 73L, 69L, 74L, 73L, 
65L, 76L, 64L, 62L, 72L, 74L, 34L, 74L, 72L, 69L, 63L, 52L, 70L, 
69L, 73L, 45L, 66L, 31L, 68L, 49L, 66L, 76L, 80L, 70L, 53L, 42L, 
66L, 59L, 24L, 59L, 68L, 64L, 58L, 71L, 59L, 69L, 56L, 65L, 63L, 
66L, 59L, 72L, 56L, 52L, 76L, 80L), var3 = c(310L, 305L, 297L, 
295L, 283L, 310L, 281L, 304L, 311L, 307L, 306L, 309L, 285L, 335L, 
305L, 336L, 289L, 270L, 334L, 308L, 307L, 290L, 283L, 285L, 332L, 
292L, 294L, 314L, 314L, 303L, 426L, 295L, 320L, 289L, 280L, 281L, 
305L, 267L, 283L, 290L, 302L, 306L, 297L, 302L, 302L, 306L, 293L, 
306L, 293L, 292L, 313L, 299L), var4 = c(0.6192056, 0.7287633, 
0.5105544, 0.6198897, 0.6399971, 0.7949742, 0.7783598, 0.6377989, 
0.7020734, 0.7437677, 0.6953473, 0.5672044, 0.7941913, 0.6661719, 
0.5500999, 0.8476085, 0.66469, 0.6448988, 0.7123679, 0.5418768, 
0.510849, 0.8393499, 0.5413923, 0.75423, 0.7872316, 0.777534, 
0.6445523, 0.7265084, 0.6255964, 0.448715, 0.870624, 0.224324, 
0.5563342, 0.6899643, 0.7351417, 0.629549, 0.6660455, 0.5632927, 
0.6810814, 0.6878466, 0.6488055, 0.6815277, 0.5262509, 0.6239796, 
0.7206896, 0.5208803, 0.7122809, 0.7217113, 0.7916376, 0.7499101, 
0.6795473, 0.2475061), group = c("group1", "group1", "group1", 
"group1", "group1", "group1", "group1", "group1", "group1", "group1", 
"group1", "group1", "group1", "group2", "group2", "group2", "group2", 
"group2", "group2", "group2", "group2", "group2", "group2", "group2", 
"group2", "group2", "group3", "group3", "group3", "group3", "group3", 
"group3", "group3", "group3", "group3", "group3", "group3", "group3", 
"group3", "group4", "group4", "group4", "group4", "group4", "group4", 
"group4", "group4", "group4", "group4", "group4", "group4", "group4"
)), class = "data.frame", row.names = c("1", "2", "3", "4", "5", 
"6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", 
"17", "18", "19", "20", "21", "22", "23", "24", "25", "26", "27", 
"28", "29", "30", "31", "32", "33", "34", "35", "36", "37", "38", 
"39", "40", "41", "42", "43", "44", "45", "46", "47", "48", "49", 
"50", "51", "52"))
