Nonparametric test to compare rows in different dataframes in R
This is my first post here.
I have 4 dataframes and I would like to run a nonparametric test for each row, one row at a time.
For example, I would like to compare the values in each row of dataframe A with the values in the corresponding row of dataframe B.
I need a nonparametric test, e.g. a Wilcoxon test or something similar.
I thought of adding a new column with the median, but I am sure there is something better.
Could you give me an idea of how to do this?
Thank you in advance!
Edit: Here are my imaginary dataframes.
I want to compare the dataframes row-wise, e.g. do a nonparametric test for John in dataframes A and B, then for Dora, etc.
A <- data.frame("A" = c("John","Dora","Robert","Jim"),
"A1" = c(8,1,10,5),
"A2"= c(9,1,1,4))
B <- data.frame("B" = c("John","Dora","Robert","Jim"),
"B1" = c(1,1,1,5),
"B2"= c(3,2,1,5),
"B3"=c(4,3,1,5),
"B4"=c(6,8,8,1))
I think you are looking for the function wilcox.test (in the stats package).
Solution 1: Using a for loop
One way to compare each row of A with the corresponding row of B (and extract the p-value) is to write a for loop such as this:
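pval = NULL
for (i in seq_len(nrow(A)))
{
  # numeric values of row i of A (every column except the name)
  vec_a = as.numeric(A[i, 2:ncol(A)])
  # numeric values of the row of B that has the same name
  vec_b = as.numeric(B[B$B == A$A[i], 2:ncol(B)])
  p <- wilcox.test(vec_a, vec_b)
  # collect the p-value and print the full test result
  pval = c(pval, p$p.value)
  print(p)
}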
At the end, you will get a vector pval containing the p-value for each row:
pval
[1] 0.1333333 0.2188194 0.5838824 1.0000000
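The same loop can also be written as a function applied over the names, which returns a named vector so that each p-value stays attached to its row. This is only a sketch in base R: the helper name row_pvalue is made up for this example, and it assumes that every name in A appears exactly once in B. Note that wilcox.test warns that it cannot compute an exact p-value for rows containing ties (e.g. Dora) and falls back to a normal approximation.

```r
A <- data.frame(A = c("John", "Dora", "Robert", "Jim"),
                A1 = c(8, 1, 10, 5),
                A2 = c(9, 1, 1, 4))
B <- data.frame(B = c("John", "Dora", "Robert", "Jim"),
                B1 = c(1, 1, 1, 5),
                B2 = c(3, 2, 1, 5),
                B3 = c(4, 3, 1, 5),
                B4 = c(6, 8, 8, 1))

# Hypothetical helper: p-value of the Wilcoxon test for one name
row_pvalue <- function(name) {
  vec_a <- as.numeric(A[A$A == name, -1])  # all columns except the name
  vec_b <- as.numeric(B[B$B == name, -1])
  wilcox.test(vec_a, vec_b)$p.value
}

# vapply over a character vector keeps the names on the result
pval <- vapply(as.character(A$A), row_pvalue, numeric(1))
pval
```

With the names attached, a single result can be looked up directly, for example pval[["John"]].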
Solution 2: Using tidyverse
A more elegant solution is to use the tidyverse packages (in particular dplyr and tidyr) to combine your dataframes into a single one, and then compare the two groups for each name by passing a formula to the function wilcox.test.
First, we can merge your dataframes by name using the left_join function from dplyr:
library(dplyr)
DF <- left_join(A,B, by = c("A"="B"))
A A1 A2 B1 B2 B3 B4
1 John 8 9 1 3 4 6
2 Dora 1 1 1 2 3 8
3 Robert 10 1 1 1 1 8
4 Jim 5 4 5 5 5 1
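Because a left join keeps every row of A and fills the B columns with NA when a name has no match, it can be worth checking the names before joining. The sketch below does that check and also shows a base R counterpart of the join using merge; the object names unmatched and DF_base are only illustrative and are not used in the rest of this answer.

```r
A <- data.frame(A = c("John", "Dora", "Robert", "Jim"),
                A1 = c(8, 1, 10, 5),
                A2 = c(9, 1, 1, 4))
B <- data.frame(B = c("John", "Dora", "Robert", "Jim"),
                B1 = c(1, 1, 1, 5),
                B2 = c(3, 2, 1, 5),
                B3 = c(4, 3, 1, 5),
                B4 = c(6, 8, 8, 1))

# Names present in A but missing from B (they would get NA after the join)
unmatched <- setdiff(as.character(A$A), as.character(B$B))
unmatched

# Base R counterpart of the left join; merge() sorts the rows by the key
DF_base <- merge(A, B, by.x = "A", by.y = "B", all.x = TRUE)
DF_base
```

Here unmatched is empty, so every row of A finds its partner in B.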
Then, using the dplyr and tidyr packages, you can reshape your dataframe into a longer format:
library(dplyr)
library(tidyr)
DF %>% pivot_longer(., -A, names_to = "var", values_to = "values")
# A tibble: 24 x 3
A var values
<fct> <chr> <dbl>
1 John A1 8
2 John A2 9
3 John B1 1
4 John B2 3
5 John B3 4
6 John B4 6
7 Dora A1 1
8 Dora A2 1
9 Dora B1 1
10 Dora B2 2
# … with 14 more rows
We will create a new column "group" that indicates A or B depending on the values in the column var:
DF %>% pivot_longer(., -A, names_to = "var", values_to = "values") %>%
mutate(group = gsub("\\d","",var))
# A tibble: 24 x 4
A var values group
<fct> <chr> <dbl> <chr>
1 John A1 8 A
2 John A2 9 A
3 John B1 1 B
4 John B2 3 B
5 John B3 4 B
6 John B4 6 B
7 Dora A1 1 A
8 Dora A2 1 A
9 Dora B1 1 B
10 Dora B2 2 B
# … with 14 more rows
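To see what the gsub call does on its own, here is a small standalone sketch. It relies on the column names being a letter prefix followed by digits, as in the example data; the name "B10" is made up here to show that multi-digit suffixes are stripped as well.

```r
vars <- c("A1", "A2", "B1", "B4", "B10")

# "\\d" matches a single digit; gsub() removes every match
group <- gsub("\\d", "", vars)
group
# [1] "A" "A" "B" "B" "B"
```

If your real column names do not follow this pattern, the regular expression would need to be adapted.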
Finally, we can group by A and summarise the dataframe to get the p-value returned by the function wilcox.test when comparing the values of the two groups for each name:
DF %>% pivot_longer(., -A, names_to = "var", values_to = "values") %>%
mutate(group = gsub("\\d","",var)) %>%
group_by(A) %>%
summarise(Pval = wilcox.test(values~group)$p.value)
# A tibble: 4 x 2
A Pval
<fct> <dbl>
1 Dora 0.219
2 Jim 1
3 John 0.133
4 Robert 0.584
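To make the formula call less of a black box, here is a small sketch for a single name. The dataframe john is typed in by hand from the tables above and is only for illustration; it shows that the formula interface and the two-vector call from Solution 1 give the same p-value.

```r
# Long-format values for John only
john <- data.frame(values = c(8, 9, 1, 3, 4, 6),
                   group  = c("A", "A", "B", "B", "B", "B"))

# Formula interface: split `values` by the two levels of `group`
p_formula <- wilcox.test(values ~ group, data = john)$p.value

# Two-vector interface, as in Solution 1
p_vectors <- wilcox.test(c(8, 9), c(1, 3, 4, 6))$p.value

c(formula = p_formula, vectors = p_vectors)
```

Inside summarise, the same formula call is simply evaluated once per name on the rows of that group.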
Solution 2 looks longer (especially because I explain each step), but in the end you can see that it needs fewer lines than the first solution.
Does it answer your question?