One-way ANOVA analysis by row
I would like to run a one-way ANOVA to compare four groups (A, B, C, D). Each group has a different number of replicates, stored in separate columns, and each row is one individual item I want to test. Each column can be treated as a replicate, and the rows are independent of one another. Ultimately, I want to find out which items (rows) differ significantly between the four groups.
Below is an example of the data structure. The real data are already normalized.
| | A1 | A2 | A3 | B1 | B2 | C1 | C2 | D1 | D2 | D3 |
|---|---|---|---|---|---|---|---|---|---|---|
| protein1 | 15 | 30 | 28 | 6 | 7 | 9 | 30 | 45 | 66 | 43 |
| protein2 | 2 | 4 | 3 | 56 | 54 | 23 | 25 | 12 | 13 | 5 |
| protein3 | 2 | 4 | 3 | 56 | 54 | 23 | 25 | 12 | 13 | 5 |
| protein4 | 2 | 4 | 3 | 56 | 54 | 23 | 25 | 12 | 13 | 5 |
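For reproducibility, the example table can be entered as a data frame along the following lines. This is only a sketch: the object name df and the first-column name prot are assumptions, chosen because the reshaping code in the answer refers to them, and vals is a hypothetical intermediate matrix.

```r
# Example values from the table, one row per protein.
# `vals` is an illustrative intermediate name, not part of the original post.
vals <- rbind(
  protein1 = c(15, 30, 28, 6, 7, 9, 30, 45, 66, 43),
  protein2 = c(2, 4, 3, 56, 54, 23, 25, 12, 13, 5),
  protein3 = c(2, 4, 3, 56, 54, 23, 25, 12, 13, 5),
  protein4 = c(2, 4, 3, 56, 54, 23, 25, 12, 13, 5)
)
colnames(vals) <- c("A1", "A2", "A3", "B1", "B2", "C1", "C2", "D1", "D2", "D3")

# Assumed layout: protein names in a first column called `prot`,
# followed by one numeric column per replicate.
df <- data.frame(prot = rownames(vals), vals, row.names = NULL)
str(df)
```

With the data in this shape, the first column identifies the item and every remaining column is one replicate whose name encodes its group.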
One way to do this:
First, reshape the data into a format that the model can handle. This uses the tidyverse package and assumes the data are in a data frame df whose first column, holding the protein names, is called prot.
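library(tidyverse)

df_long <- df %>%
  pivot_longer(cols = 2:ncol(.)) %>%                            # one row per protein/replicate pair
  pivot_wider(names_from = prot, values_from = value) %>%      # one column per protein
  separate(name, into = c("trt"), sep = "\\d", extra = "drop")  # "A1" -> "A"; drop the empty remainder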
Which looks like:
trt protein1 protein2 protein3 protein4
<chr> <dbl> <dbl> <dbl> <dbl>
1 A 15 2 2 2
2 A 30 4 4 4
3 A 28 3 3 3
4 B 6 56 56 56
5 B 7 54 54 54
6 C 9 23 23 23
7 C 30 25 25 25
8 D 45 12 12 12
9 D 66 13 13 13
10 D 43 5 5 5
Then you can apply whatever model or statistical test you like. For example, to fit an ANOVA for each protein column, you could define a helper function and then map it over the columns:
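fit_aov <- function(col) {
  # One-way ANOVA of a single protein column against the treatment group
  aov(col ~ trt, data = df_long)
}

# Fit one model per protein column (every column except trt)
anovas <- map(df_long[, 2:ncol(df_long)], fit_aov)
summary(anovas$protein2)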
Df Sum Sq Mean Sq F value Pr(>F)
trt 3 3648 1216.0 165.8 3.69e-06 ***
Residuals 6 44 7.3
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
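To answer the original question of which rows differ, the p-value of each model has to be collected rather than read off one summary at a time. The following base R sketch does this directly on the wide table. The helper row_anova_p and the names vals, trt, p_raw and res are hypothetical, and the Benjamini-Hochberg adjustment is one common choice for testing many rows at once, not something prescribed by the answer above.

```r
# Illustrative helper (not from the original answer): p-value of a one-way
# ANOVA for every row of a numeric matrix, given one group label per column.
row_anova_p <- function(mat, groups) {
  groups <- factor(groups)
  apply(mat, 1, function(y) {
    fit <- aov(y ~ groups)
    summary(fit)[[1]][["Pr(>F)"]][1]  # p-value of the group term
  })
}

# Example data from the question, entered as a matrix (rows = proteins).
vals <- rbind(
  protein1 = c(15, 30, 28, 6, 7, 9, 30, 45, 66, 43),
  protein2 = c(2, 4, 3, 56, 54, 23, 25, 12, 13, 5),
  protein3 = c(2, 4, 3, 56, 54, 23, 25, 12, 13, 5),
  protein4 = c(2, 4, 3, 56, 54, 23, 25, 12, 13, 5)
)
colnames(vals) <- c("A1", "A2", "A3", "B1", "B2", "C1", "C2", "D1", "D2", "D3")

# Group label = column name without the replicate number ("A1" -> "A").
trt <- sub("\\d+$", "", colnames(vals))

p_raw <- row_anova_p(vals, trt)

# Adjust for multiple testing and sort so the strongest rows come first.
res <- data.frame(p_value = p_raw, p_adj = p.adjust(p_raw, method = "BH"))
res[order(res$p_adj), ]
```

Rows whose adjusted p-value falls below the chosen threshold (for example 0.05) are the items that differ between the groups. A significant ANOVA only says that at least one group differs, so a post-hoc test such as TukeyHSD on the fitted model would be needed to see which groups.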