
Gather multiple columns in R, then create Boxplot

I'm pretty new to R and got stuck trying to make a boxplot. I have a dataset with 18 variables and almost 20,000 rows. It looks something like this:

EnsemblID         GeneName   Sample1A   Sample1B   Sample2A   Sample2B
ENSG00000180096   ABCD        0.000000   0.378398   0.256493   0.488798
ENSG00000247626   ACED       20.770695  17.456049  19.066029  17.960966

I want to make a boxplot per gene (column GeneName), with the values from sample 1 in one box (1A, 1B) and the values from sample 2 in a different box (2A, 2B). In reality I have three groups with 5-6 replicates each. How do I melt the data into a tall data frame like this?

GeneName   Group      Value
ABCD       Sample1A    0.000000
ABCD       Sample1B    0.378398
ABCD       Sample2A    0.256493
ABCD       Sample2B    0.488798
ACED       Sample1A   20.770695
ACED       Sample1B   17.456049
ACED       Sample2A   19.066029
ACED       Sample2B   17.960966

And how can I make a boxplot that shows the variation for each gene within and between the groups?

I would appreciate any help. Thanks!

Using tidyr from the tidyverse:

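# Example data from the question, without the leading line numbers
df <- read.table(header = TRUE, text = "
EnsemblID GeneName Sample1A Sample1B Sample2A Sample2B
ENSG00000180096 ABCD 0.000000 0.378398 0.256493 0.488798
ENSG00000247626 ACED 20.770695 17.456049 19.066029 17.960966
")

library(tidyr)

# Stack the sample columns into key/value pairs: Group holds the old
# column name, Value holds the measurement
df <- gather(df, Group, Value, Sample1A:Sample2B)

# One box per original column (Sample1A, Sample1B, ...), pooling all genes
boxplot(Value ~ Group, data = df)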
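
The call above draws one box per replicate column, whereas the question asks for one box per sample group within each gene. A minimal sketch of one way to get there follows. It uses pivot_longer(), the newer tidyr function that supersedes gather(). It also assumes that the trailing capital letter in each column name marks the replicate, so that stripping it leaves the group name. The column names Replicate and Group are chosen here for illustration and do not come from the original post.

```r
library(tidyr)

# Same toy data as in the question: two genes, two samples, two replicates each
df <- read.table(header = TRUE, text = "
EnsemblID GeneName Sample1A Sample1B Sample2A Sample2B
ENSG00000180096 ABCD 0.000000 0.378398 0.256493 0.488798
ENSG00000247626 ACED 20.770695 17.456049 19.066029 17.960966
")

# Reshape to long format: one row per gene and replicate
long <- pivot_longer(df, cols = starts_with("Sample"),
                     names_to = "Replicate", values_to = "Value")

# Assumption: the last capital letter is the replicate label,
# so "Sample1A" and "Sample1B" both become "Sample1"
long$Group <- sub("[A-Z]$", "", long$Replicate)

# One box per group/gene combination (Sample1.ABCD, Sample2.ABCD, ...)
boxplot(Value ~ Group + GeneName, data = long,
        las = 2, xlab = "", ylab = "Value")
```

With close to 20,000 genes, a box for every gene will not be readable, so it is probably worth subsetting long to the genes of interest before plotting.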
