Pass column names as function arguments in formula

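I want to create a reusable function that runs repeated t-tests, so that column names can be passed into the formula. However, I cannot find a way to make it work. The following code shows the idea: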

library(dplyr)
library(rstatix)
do.function <- function(table, column, category) {
  column = sym(column)
  category = sym(category)
  
  stat.test <- table %>%
    group_by(subset) %>%
    t_test(column ~ category)
  
  return(stat.test)
}
tmp = data.frame(id=seq(1:100), value = rnorm(100), subset = rep(c("Set1", "Set2"),each=50,2),categorical_value= rep(c("A", "B"),each=25,4))
do.function(table= tmp, column = "value", category = "categorical_value")

The error I currently get is:

Error: Can't extract columns that don't exist.
x Column `category` doesn't exist.
Run `rlang::last_error()` to see where the error occurred. 

Does anyone know how to solve this?

Just build a formula from the strings instead of wrapping them in sym:

library(dplyr)
library(rstatix)
do.function <- function(table, column, category) {
  formula <- paste0(column, '~', category) %>% 
    as.formula()
  
  table %>%
    group_by(subset) %>%
    t_test(formula)
}
tmp = data.frame(id=seq(1:100), value = rnorm(100), subset = rep(c("Set1", "Set2"),each=50,2),categorical_value= rep(c("A", "B"),each=25,4))
do.function(table= tmp, column = "value", category = "categorical_value")
# A tibble: 2 x 9
  subset .y.   group1 group2    n1    n2 statistic    df     p
* <chr>  <chr> <chr>  <chr>  <int> <int>     <dbl> <dbl> <dbl>
1 Set1   value A      B         50    50     0.484  94.3 0.63 
2 Set2   value A      B         50    50    -2.15   97.1 0.034
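The error in the question arises because a formula quotes its operands: `column ~ category` keeps the literal symbols `column` and `category` and never looks up the values stored in those variables. A minimal base R sketch of the difference, using the same argument values as the question (the names `literal` and `built` are only illustrative):

```r
column <- "value"
category <- "categorical_value"

# A formula written literally is not evaluated; it keeps the symbols as typed.
literal <- column ~ category
all.vars(literal)
# "column" "category"

# Building the formula from strings substitutes the actual column names.
built <- as.formula(paste0(column, "~", category))
all.vars(built)
# "value" "categorical_value"
```

This is why `t_test` went looking for a column called `category`: that is the name that appears in the literal formula.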

Since we are passing string values, we can simply use reformulate to build the formula:

do.function <- function(table, column, category) {
  
  
  stat.test <- table %>%
    group_by(subset) %>%
    t_test(reformulate(category, response = column ))
  
  return(stat.test)
}

-testing

> do.function(table= tmp, column = "value", category = "categorical_value")
# A tibble: 2 × 9
  subset .y.   group1 group2    n1    n2 statistic    df      p
* <chr>  <chr> <chr>  <chr>  <int> <int>     <dbl> <dbl>  <dbl>
1 Set1   value A      B         50    50     1.66   97.5 0.0993
2 Set2   value A      B         50    50     0.448  92.0 0.655 
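The same `reformulate` idea also works without rstatix. The following is a rough base R sketch, not the answer's method: `t_by_subset` and its `group` argument are hypothetical names introduced here, and `stats::t.test` (Welch by default) stands in for `rstatix::t_test`. It only returns the p-value per subset.

```r
# Hypothetical helper for illustration; base R only.
t_by_subset <- function(table, column, category, group = "subset") {
  # Build e.g. value ~ categorical_value from the two strings
  f <- reformulate(category, response = column)
  # One data frame per level of the grouping column
  pieces <- split(table, table[[group]])
  # Two-sample t-test within each piece, keeping only the p-value
  sapply(pieces, function(d) t.test(f, data = d)$p.value)
}

set.seed(1)
tmp <- data.frame(
  value = rnorm(200),
  subset = rep(c("Set1", "Set2"), each = 50, times = 2),
  categorical_value = rep(c("A", "B"), each = 25, times = 4)
)

# Named numeric vector with one p-value per subset
t_by_subset(tmp, column = "value", category = "categorical_value")
```

Because the formula is built once from strings, the helper can be reused for any response and grouping column without touching its body.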

rstatix::t_test already takes a formula, so we can simply get the variables by their names and store them under fixed column names:

do.function <- function(table, column, category) {
  stat.test <- table  %>%
    mutate(column=get(column), 
           category=get(category)) %>%
    rstatix::t_test(column ~ category)
  return(stat.test)
}

do.function(table=tmp, column="value", category="categorical_value")
# # A tibble: 1 × 8
# .y.    group1 group2    n1    n2 statistic    df     p
# * <chr>  <chr>  <chr>  <int> <int>     <dbl> <dbl> <dbl>
# 1 column A      B        100   100     0.996  197.  0.32
