
Adding several box plots in one

I have a dataset with three different groups of individuals; let's call them Green, Red, and Blue. I also have data covering 92 proteins in their blood, with a reading for each individual in each group.

I would like to get a good overview of the variances and means for each protein in each group, which means that I would like to make a graph with multiple box plots.

I would like to have the different proteins on the x-axis, with three box plots above every protein (one for each group, preferably in different colors) and the numeric protein weight on the y-axis.

How do I do this?

I am currently working with a data frame where the groups are divided by rows, and the different protein readings are in separate columns.

I tried to add a picture, but apparently you need reputation points…

I've heard that you can use the melt command in reshape2, but I need guidance on how to use it.

Please keep the answers simple. I'm not very experienced when it comes to R.

Look, I realize things are frustrating when you are first getting started, but you're going to have to ask specific and targeted questions for people to be willing and able to help you out in a structured way.

Having said that, let's walk through a structured example. I am only going to use 9 proteins here, but you should get the idea.

library(ggplot2)
library(reshape2)

# Setup a data frame, since the question did not provide one...
df <- structure(list(Individual = 1:12, 
                     Group = structure(c(2L, 1L, 3L, 2L, 1L, 3L, 2L, 1L, 3L, 2L, 1L, 3L), 
                              .Label = c("Blue", "Green", "Red"), class = "factor"), 
                     Protein_1 = c(82L, 23L, 19L, 100L, 33L, 86L, 32L, 41L, 39L, 59L, 93L, 99L), 
                     Protein_2 = c(86L, 50L, 86L, 90L, 37L, 20L, 26L, 38L, 87L, 81L, 23L, 49L), 
                     Protein_3 = c(81L, 31L, 5L, 10L, 79L, 40L, 27L, 73L, 64L, 30L, 87L, 64L), 
                     Protein_4 = c(52L, 15L, 25L, 12L, 63L, 52L, 60L, 33L, 27L, 32L, 53L, 93L), 
                     Protein_5 = c(19L, 75L, 25L, 14L, 33L, 60L, 73L, 13L, 92L, 92L, 91L, 12L), 
                     Protein_6 = c(33L, 49L, 29L, 58L, 51L, 12L, 61L, 48L, 71L, 18L, 84L, 31L), 
                     Protein_7 = c(84L, 57L, 28L, 99L, 47L, 54L, 72L, 97L, 73L, 46L, 68L, 37L), 
                     Protein_8 = c(15L, 16L, 46L, 95L, 57L, 86L, 30L, 83L, 45L, 12L, 49L, 82L), 
                     Protein_9 = c(84L, 91L, 33L, 10L, 91L, 91L, 4L, 88L, 42L, 82L, 76L, 95L)), 
                .Names = c("Individual", "Group", "Protein_1", "Protein_2", "Protein_3", 
                           "Protein_4", "Protein_5", "Protein_6", "Protein_7", "Protein_8", "Protein_9"), 
                class = "data.frame", row.names = c(NA, -12L))

head(df)
# Individual Group Protein_1 Protein_2 Protein_3 Protein_4 Protein_5 Protein_6 Protein_7 Protein_8 Protein_9
# 1          1 Green        82        86        81        52        19        33        84        15        84
# 2          2  Blue        23        50        31        15        75        49        57        16        91
# 3          3   Red        19        86         5        25        25        29        28        46        33
# 4          4 Green       100        90        10        12        14        58        99        95        10
# 5          5  Blue        33        37        79        63        33        51        47        57        91
# 6          6   Red        86        20        40        52        60        12        54        86        91
?melt
df.melted <- melt(df, id.vars = c("Individual", "Group"))
head(df.melted)
# Individual Group  variable value
# 1          1 Green Protein_1    82
# 2          2  Blue Protein_1    23
# 3          3   Red Protein_1    19
# 4          4 Green Protein_1   100
# 5          5  Blue Protein_1    33
# 6          6   Red Protein_1    86
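
For anyone new to melt(), here is a minimal sketch of what the call above does, on a made-up three-row frame. The name `wide` and its values are illustrative assumptions, not data from the question. The columns listed in id.vars are repeated on every row, and each remaining column is stacked into a variable/value pair.

```r
library(reshape2)

# Hypothetical toy data: 3 individuals, 2 protein columns (values are made up)
wide <- data.frame(Individual = 1:3,
                   Group = c("Green", "Blue", "Red"),
                   Protein_1 = c(82, 23, 19),
                   Protein_2 = c(86, 50, 86))

# id.vars are kept as identifiers; every other column is stacked into
# 'variable' (the old column name) and 'value' (the reading)
long <- melt(wide, id.vars = c("Individual", "Group"))

nrow(long)   # 3 individuals x 2 protein columns = 6 rows
names(long)  # "Individual" "Group" "variable" "value"
```

The long frame has one row per individual per protein, which is the shape ggplot2 expects when a protein name should act as a plotting variable.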

# First Protein
# Notice I am using subset()
ggplot(data = subset(df.melted, variable == "Protein_1"),
       aes(x = Group, y = value)) + geom_boxplot(aes(fill = Group))

Figure: Protein 1

# Second Protein
ggplot(data = subset(df.melted, variable == "Protein_2"),
       aes(x = Group, y = value)) + geom_boxplot(aes(fill = Group))

Figure: Protein 2

# and so on...

# You could also use facets
ggplot(data = df.melted, aes(x = Group, y = value)) + 
  geom_boxplot(aes(fill = Group)) +
  facet_wrap(~ variable)

Figure: Facets
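
The layout described in the question (proteins along the x-axis, one box per group above each protein) could be sketched as follows. This is one possible approach, not necessarily what the answer intended; the frame `toy` and its random values are assumptions made up for illustration. Mapping the melted `variable` column to x and `Group` to fill makes geom_boxplot() draw the groups side by side.

```r
library(ggplot2)
library(reshape2)

# Hypothetical toy data (random values) standing in for the real readings
set.seed(1)
toy <- data.frame(Individual = 1:12,
                  Group = rep(c("Green", "Blue", "Red"), times = 4),
                  Protein_1 = runif(12, 0, 100),
                  Protein_2 = runif(12, 0, 100),
                  Protein_3 = runif(12, 0, 100))
toy.melted <- melt(toy, id.vars = c("Individual", "Group"))

# x = protein, fill = group: the boxes for each group are dodged
# next to each other above every protein
p <- ggplot(toy.melted, aes(x = variable, y = value, fill = Group)) +
  geom_boxplot() +
  labs(x = "Protein", y = "Reading") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))
p
```

With 92 proteins the x-axis will probably be crowded even with rotated labels, so splitting the proteins across several plots or combining this with facets may read better.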

And yes, I realize that the color groupings do not align with the colors of the plot... I will leave that as an exercise... You have to be willing to tinker, explore, and fail many times.
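
For readers who want a hint on that exercise, one way to make the fill colors match the group names is scale_fill_manual() with a named vector. This is a minimal sketch on made-up data; `toy.long`, `group.colors`, and the specific color choices are assumptions, not part of the original answer.

```r
library(ggplot2)

# Hypothetical toy data in long form (values are random)
set.seed(1)
toy.long <- data.frame(Group = rep(c("Green", "Blue", "Red"), each = 5),
                       value = runif(15, 0, 100))

# Named vector: the names must match the values of Group exactly
group.colors <- c(Green = "green3", Red = "red", Blue = "blue")

p <- ggplot(toy.long, aes(x = Group, y = value, fill = Group)) +
  geom_boxplot() +
  scale_fill_manual(values = group.colors)
p
```

Because the vector is named, the order of its entries does not matter; each group is matched to its color by name.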
