简体   繁体   English

rstatix::wilcox_test 不将公式作为变量

[英]rstatix::wilcox_test doesn't take formula as variable

I am trying to do boxplots and add p-values from rstatix::wilcox_test() .我正在尝试绘制箱线图并从rstatix::wilcox_test()添加 p 值。 But I want to do this in a loop.但我想循环执行此操作。 so the column names for the formula for wilcox_test is stored in a variable.所以wilcox_test公式的列名存储在一个变量中。 When I use the variable name in the wilcox_test, it throws error当我在 wilcox_test 中使用变量名时,它会抛出错误

Error in `pull()`:
! Can't extract columns that don't exist.
✖ Column `rs_id` doesn't exist.

Can someone please help me how to use variables in the formula for wilcox_test()有人可以帮我如何在 wilcox_test() 的公式中使用变量吗

here is my code这是我的代码

rs_ids <- c('rs2160669', 'rs964184')
phenos <- c('HBA1C', 'TG')
for (rs_id in rs_ids){
  for(phen in phenos){
    bxp <- ggboxplot(d.k, x = rs_id, y = phen)
    stat.test <- d.k %>%  wilcox_test(.data[[`phen`]] ~ .data[[`rs_id`]]) %>%  add_significance()
    stat.test <- stat.test %>% add_xy_position(x = .data[[rs_id]]) %>% mutate(myformatted.p = ifelse(p < 0.001, 'p<0.001', paste0('p=',signif(p, digits = 2))))
  
    bxp <-  bxp +   geom_jitter(position=position_jitter(0.2), alpha = 0.5)   +  theme_bw() + theme(legend.position="none", axis.text=element_text(size=12), axis.title=element_text(size=14,face="bold")) + xlab(rs_id) + ylab(phen)
    plot <- bxp + stat_pvalue_manual(stat.test, label = "{myformatted.p}", tip.length = 0.01, step.increase = 0.1)
  }
}

sample data is below示例数据如下

structure(list(HBA1C = c(4.5, 5.9, 8.02, 5.5, 5.4, 6, 7.1, 9.9, 
5.58, 5, 7.6, 8, 5.5, 6.4, 6.8, 9, 6.2, 4.9, 5.7, 6.2, 6.3, 5.7, 
5.9, 6.9, 7.3, 5.7, 7.2, 10.4, 5.4, 5.3, 4.8, 5.5, 7.1, 6.2, 
7, 7.4, 11.4, 5.4, 5.5, 5.3, 5.5, 5.9, 5.5, 5.79, 4.8, 6.16, 
5.63, 7.41, 7.68, 5.82), TG = c(0.83, 0.95, 2.48, 0.78, 0.48, 
1.16, 1.45, 1.32, 1.03, 3.77, 3.87, 2.34, 1.98, 2.59, 2.92, 2.71, 
6.88, 1.25, 3.68, 1.15, 2.33, 3.37, 0.61, 1.02, 1.63, 1.32, 1.21, 
1.35, 0.85, 3.97, 4.04, 3.63, 2.62, 1.46, 2.33, 2.46, 1.09, 1.46, 
2.77, 3.1, 3.13, 2.55, 1.91, 0.97, 0.87, 1.46, 1.45, 1.15, 2.61, 
2.15), rs2160669 = c(3, 2, 3, 3, 3, 3, 3, 3, 3, 2, 3, 2, 3, 2, 
1, 2, 3, 3, 3, 3, 2, 2, 3, 3, 3, 3, 3, 2, 3, 3, 3, 3, 3, 3, 3, 
2, 3, 3, 3, 2, 3, NA, 3, 3, 3, 3, 3, 3, 2, 2), rs964184 = c(1, 
2, 1, 2, 1, 2, 1, 1, 1, 2, 2, 2, 1, 2, 3, 2, 1, 1, 1, 1, 2, 2, 
1, 1, 1, 2, 1, 2, 1, 1, 2, 1, 1, 1, 1, 2, 1, 1, 1, 2, 1, NA, 
2, 1, 3, 1, 1, 2, 2, 2)), row.names = c(NA, -50L), class = c("tbl_df", 
"tbl", "data.frame"))

A few issues here:这里有几个问题:

  1. Backticks are wrong, even in dplyr verbs that recognize .data you'd need to use .data[[phen]] and .data[[rs_id]] .反引号是错误的,即使在识别.data的 dplyr 动词中,您也需要使用.data[[phen]].data[[rs_id]] (Cf, dk %>% count(.data[[phen]]) ). (Cf, dk %>% count(.data[[phen]]) )。

  2. .data can only be used in tidyverse-aware functions (eg, dplyr), it won't work in wilcox_test . .data只能用于 tidyverse-aware 函数(例如,dplyr),它在wilcox_test中不起作用。

I suggest you make the formula as a string and then as.formula it.我建议您将公式设为字符串,然后将其as.formula This will work in the %>% -pipe since the first argument to the test is data= , so the formula (as we form it) will be the second arg, formula= .这将在%>%管道中工作,因为测试的第一个参数是data= ,所以公式(正如我们形成的那样)将是第二个参数, formula=

rs_id <- "rs2160669"
phen <- "HBA1C"
as.formula(paste(phen, rs_id, sep = "~"))
# HBA1C ~ rs2160669

library(dplyr) # just for %>% which is magrittr, but I'm inferring your use of dplyr
library(rstatix)
d.k %>%
  wilcox_test(as.formula(paste(phen, rs_id, sep = "~")))
# # A tibble: 3 x 9
#   .y.   group1 group2    n1    n2 statistic     p p.adj p.adj.signif
# * <chr> <chr>  <chr>  <int> <int>     <dbl> <dbl> <dbl> <chr>       
# 1 HBA1C 1      2          1    12         7 0.923  1    ns          
# 2 HBA1C 1      3          1    36        25 0.542  1    ns          
# 3 HBA1C 2      3         12    36       280 0.13   0.39 ns          

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM