R for loop to perform Fisher's test - Error message

My data frame looks like this:

595.00000    18696      984.00200     32185    Group1  
935.00000    18356      1589.00000    31580    Group2            
40.00010     19251      73.00000      33096    Group3            
1058.00000   18233      1930.00000    31239    Group4                
19.00000     19272      27.00000      33142    Group5            
1225.00000   18066      2149.00000    31020    Group6  
....                 

For every group I want to run Fisher's exact test.

table <- matrix(c(595.00000, 984.00200, 18696, 32185), ncol=2, byrow=TRUE)
Group1 <- fisher.test(table, alternative="greater")

I tried to loop over the data frame with:

for (i in 1:nrow(data.frame))  
 {  
 table= matrix(c(data.frame$V1, data.frame$V2, data.frame$V3, data.frame$V4), ncol=2, byrow=T)    
fisher.test(table, alternative="greater")  
}

But I got this error message:

Error in fisher.test(table, alternative = "greater") :  
FEXACT error 40.  
Out of workspace.  
In addition: Warning message:  
In fisher.test(table, alternative = "greater")  :  
'x' has been rounded to integer: Mean relative difference: 2.123828e-06

How can I fix this problem, or is there another way to loop over the data?

Your first error is: Out of workspace

?fisher.test
fisher.test(x, y = NULL, workspace = 200000, hybrid = FALSE,
        control = list(), or = 1, alternative = "two.sided",
        conf.int = TRUE, conf.level = 0.95,
        simulate.p.value = FALSE, B = 2000)

You should try increasing the workspace (default = 2e5).
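Raising the workspace may not be enough on its own. The loop in the question never uses i, so every iteration builds one large table from the whole columns instead of a 2 x 2 table from a single row. According to ?fisher.test, the workspace argument is only used for tables larger than 2 by 2, which would explain why a tall stacked table runs out of workspace. Below is a minimal sketch of a per-row loop. The data frame name df, the column names V1 to V4 and grp, and the three sample rows (taken from the question and rounded) are assumptions for illustration.

```r
# Hypothetical data: the first three rows from the question, rounded to integers.
df <- data.frame(
  V1  = c(595, 935, 40),
  V2  = c(18696, 18356, 19251),
  V3  = c(984, 1589, 73),
  V4  = c(32185, 31580, 33096),
  grp = c("Group1", "Group2", "Group3"),
  stringsAsFactors = FALSE
)

p_values <- numeric(nrow(df))
for (i in seq_len(nrow(df))) {
  # Build a 2x2 table from row i only: first row (V1, V3), second row (V2, V4),
  # which matches the layout of the single-group example in the question.
  tab <- matrix(c(df$V1[i], df$V3[i], df$V2[i], df$V4[i]),
                ncol = 2, byrow = TRUE)
  p_values[i] <- fisher.test(tab, alternative = "greater")$p.value
}
names(p_values) <- df$grp
p_values
```

Storing the results in a preallocated vector keeps the p-values after the loop ends. In the original loop the result of fisher.test was discarded on every iteration.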

However, this happens in your case because your counts are very large. As a rule of thumb, if all elements of your matrix are > 5 (or, in your case, 10, because df = 1), you can safely approximate the exact test with a chi-square test of independence using chisq.test. For your case, I think you should use chisq.test instead.
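The rule of thumb is usually stated in terms of expected counts rather than observed ones, and chisq.test returns these in its expected component. A small sketch of the check for Group1, assuming the counts from the question rounded to integers (the variable names are illustrative):

```r
# Group1 counts from the question, filled column-wise:
# column 1 = (595, 18696), column 2 = (984, 32185).
tab <- matrix(c(595, 18696, 984, 32185), ncol = 2)

res <- chisq.test(tab)

# Expected cell counts under independence.
res$expected

# TRUE when every expected count clears the threshold of 10.
all(res$expected > 10)
```

If the check fails for some rows, the exact test remains the safer choice for those groups.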

The warning message appears because some of your values are not integers (e.g. 984.00200), so fisher.test rounds them. If you really want to run fisher.test on every row, do this (assuming your data is in df and is a data.frame):

# fisher.test with bigger workspace
apply(as.matrix(df[,1:4]), 1, function(x) 
         fisher.test(matrix(round(x), ncol=2), workspace=1e9)$p.value)

Or, if you would rather substitute chisq.test (which I think you should for values this large: it is faster, and the p-values will not differ significantly):

apply(as.matrix(df[,1:4]), 1, function(x) 
         chisq.test(matrix(round(x), ncol=2))$p.value)

This will extract the p-values.

Edit 1: I just noticed that you use a one-sided Fisher's exact test. You should probably keep using Fisher's test with a bigger workspace: I am not sure a one-sided chi-square test of independence exists, since its p-value is already computed from the right-tail probability (and you cannot simply divide the p-value by 2, because the chi-square distribution is asymmetric).
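If a large-sample one-sided test is still wanted for a 2 x 2 table, one option not mentioned above is the two-proportion test in prop.test, which accepts alternative = "greater". This is a swapped-in technique, not the chi-square test of independence itself, and whether it suits the data is for the reader to judge. The sketch below uses the Group1 counts from the question, rounded, and assumes columns 1 and 3 are the "success" counts and columns 2 and 4 the remaining counts.

```r
# Group1 from the question: 595 of (595 + 18696) versus 984 of (984 + 32185).
successes <- c(595, 984)
totals    <- c(595 + 18696, 984 + 32185)

# One-sided test: is the first proportion greater than the second?
one_sided <- prop.test(successes, totals, alternative = "greater")

# Two-sided version for comparison.
two_sided <- prop.test(successes, totals, alternative = "two.sided")

one_sided$p.value
two_sided$p.value
```

The direction matches fisher.test(..., alternative = "greater") on the same table, since a larger first proportion corresponds to an odds ratio above 1.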

Edit 2: Since you need the group name alongside the p-values and you already have a data.frame, I suggest using the data.table package as follows:

# example data
set.seed(45)
df <- as.data.frame(matrix(sample(10:200, 20), ncol=4))
df$grp <- paste0("group", 1:nrow(df))
# load package
library(data.table)
dt <- data.table(df, key="grp")
# run the test once per group and store the p-value in a new column
dt[, p.val := fisher.test(matrix(c(V1, V2, V3, V4), ncol=2), 
                workspace=1e9)$p.value, by=grp]
dt
#     V1  V2  V3  V4    grp        p.val
# 1: 130  65  76  82 group1 5.086256e-04
# 2:  70  52 168 178 group2 1.139934e-01
# 3:  55 112 195  34 group3 7.161604e-27
# 4:  81  43  91  80 group4 4.229546e-02
# 5:  75  10  86  50 group5 4.212769e-05
