
One-way ANOVA analysis by row

I would like to run an ANOVA to compare four groups (A, B, C, D). Each group has a different number of replicates, stored in separate columns, and each row is one item I want to test. Each column can be treated as a replicate, and the rows are independent of one another. Ultimately, I want to find out which items (rows) differ significantly between the four groups.

Please see the example data structure below. The real data are already normalized.

          A1  A2  A3  B1  B2  C1  C2  D1  D2  D3
protein1  15  30  28   6   7   9  30  45  66  43
protein2   2   4   3  56  54  23  25  12  13   5
protein3   2   4   3  56  54  23  25  12  13   5
protein4   2   4   3  56  54  23  25  12  13   5
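To reproduce the example, the table could be entered roughly as follows, using the ten replicate columns A1 to D3. This is only a sketch: the question does not name the first column, so `prot` is an assumption chosen to match the reshaping code in the answer, and `groups` is a hypothetical helper name.

```r
# Example data from the question. The column name `prot` is an assumption
# (the question does not name the first column).
df <- data.frame(
  prot = c("protein1", "protein2", "protein3", "protein4"),
  A1 = c(15, 2, 2, 2), A2 = c(30, 4, 4, 4), A3 = c(28, 3, 3, 3),
  B1 = c(6, 56, 56, 56), B2 = c(7, 54, 54, 54),
  C1 = c(9, 23, 23, 23), C2 = c(30, 25, 25, 25),
  D1 = c(45, 12, 12, 12), D2 = c(66, 13, 13, 13), D3 = c(43, 5, 5, 5)
)

# Group label for each replicate column: strip the trailing digits ("A1" -> "A").
groups <- sub("\\d+$", "", names(df)[-1])

# Number of replicates per group, to confirm the design is unbalanced.
print(table(groups))
```

The group sizes differ (three replicates for A and D, two for B and C), which a one-way ANOVA can handle.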

One way to do this:

First, reshape the data into a format that the model can handle. This uses the tidyverse packages (the reshaping functions come from tidyr).

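library(tidyverse)

# Assumes the first column of df is named `prot` and holds the protein names
df_long <- df %>%
  pivot_longer(cols = 2:ncol(.)) %>%                           # one row per protein/replicate
  pivot_wider(names_from = prot, values_from = value) %>%      # one column per protein
  separate(name, into = "trt", sep = "\\d", extra = "drop")    # "A1" -> "A", no warning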

The result looks like this:

   trt   protein1 protein2 protein3 protein4
   <chr>    <dbl>    <dbl>    <dbl>    <dbl>
 1 A           15        2        2        2
 2 A           30        4        4        4
 3 A           28        3        3        3
 4 B            6       56       56       56
 5 B            7       54       54       54
 6 C            9       23       23       23
 7 C           30       25       25       25
 8 D           45       12       12       12
 9 D           66       13       13       13
10 D           43        5        5        5

You can then apply whatever model or statistical test you like. For example, to fit an ANOVA for each protein column, define a helper function and map it over the columns:

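# Fit a one-way ANOVA of one protein column against the treatment group
fit_aov <- function(col) {
  aov(col ~ trt, data = df_long)
}

# Apply it to every protein column (everything except trt)
anovas <- map(df_long[, 2:ncol(df_long)], fit_aov)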

summary(anovas$protein2)

            Df Sum Sq Mean Sq F value   Pr(>F)    
trt          3   3648  1216.0   165.8 3.69e-06 ***
Residuals    6     44     7.3                     
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
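To see which proteins differ, rather than inspecting one summary at a time, the p-values can be collected into a single table. The following is a minimal base R sketch, not part of the original answer: it rebuilds the reshaped data so it runs on its own, and the Benjamini-Hochberg adjustment and the 0.05 cutoff are assumed choices that you may want to change.

```r
# Rebuild the reshaped example data (same values as df_long above).
df_long <- data.frame(
  trt      = c("A", "A", "A", "B", "B", "C", "C", "D", "D", "D"),
  protein1 = c(15, 30, 28, 6, 7, 9, 30, 45, 66, 43),
  protein2 = c(2, 4, 3, 56, 54, 23, 25, 12, 13, 5),
  protein3 = c(2, 4, 3, 56, 54, 23, 25, 12, 13, 5),
  protein4 = c(2, 4, 3, 56, 54, 23, 25, 12, 13, 5)
)

# Fit one ANOVA per protein column (base R equivalent of the map() call).
anovas <- lapply(df_long[-1], function(col) aov(col ~ trt, data = df_long))

# Pull the p-value of the trt term out of each ANOVA summary table.
p_values <- sapply(anovas, function(fit) summary(fit)[[1]][["Pr(>F)"]][1])

# Adjust for testing many proteins at once.
# Benjamini-Hochberg is an assumed choice, not something the answer specifies.
results <- data.frame(
  protein = names(p_values),
  p_value = unname(p_values),
  p_adj   = p.adjust(unname(p_values), method = "BH")
)
results$significant <- results$p_adj < 0.05   # 0.05 is an assumed threshold

print(results)
```

With many rows, some form of multiple-testing correction matters, because a fixed per-test cutoff would otherwise flag a number of proteins by chance alone.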
