Writing a function around statistical tests in R

I'm writing a function for my (working) R script to clean up my code. I have no experience writing functions, but decided I should invest some time in it. The goal of the function is to perform multiple statistical tests while passing the required data frame, quantitative variable and grouping variable only once. However, I cannot get this to work. For reference, I'll use the ToothGrowth data frame to illustrate my problem.

Say I want to run a Kruskal-Wallis test and a one-way ANOVA on len, comparing the groups defined by supp, for whatever reason. I can do this separately with

kruskal.test(len ~ supp, data = ToothGrowth)
aov(len ~ supp, data = ToothGrowth)

Now I want to write a function that performs both tests. This is what I thought should work:

stat_test <- function(mydata, quantvar, groupvar) {
  kruskal.test(quantvar ~ groupvar, data = mydata)
  aov(quantvar ~ groupvar, data = mydata)
}

But if I then run stat_test(ToothGrowth, "len", "supp"), I get the error

Error in kruskal.test.default("len", "supp") : 
  all observations are in the same group 

What am I doing wrong? Any help would be much appreciated!

It looks like you need to convert your variable arguments, given as text strings, into a formula. Inside the function, quantvar ~ groupvar is taken literally, so R looks up quantvar and groupvar and finds the strings "len" and "supp" rather than the columns of the data frame. You can build the intended formula by concatenating the strings with paste(). Also, you will need to wrap print() around both of your statistical tests within the function, otherwise only the last one will display.

Here is the modified function:

stat_test <- function(mydata, quantvar, groupvar) {
  model_formula <- formula(paste(quantvar, '~', groupvar))
  print(kruskal.test(model_formula, data = mydata))
  print(aov(model_formula, data = mydata))
}
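
As a variation on the function above, here is a minimal sketch that returns both results in a named list instead of printing them, so they can be inspected later. It assumes base R only; the name stat_test_list is made up for illustration, and reformulate() is used in place of paste() as one way to build the formula from strings.

```r
# Sketch: build the formula from column-name strings and return both fits.
# stat_test_list is a hypothetical name, not part of the original answer.
stat_test_list <- function(mydata, quantvar, groupvar) {
  # reformulate("supp", response = "len") yields the formula len ~ supp
  model_formula <- reformulate(groupvar, response = quantvar)
  list(
    kruskal = kruskal.test(model_formula, data = mydata),
    anova   = aov(model_formula, data = mydata)
  )
}

results <- stat_test_list(ToothGrowth, "len", "supp")
results$kruskal$p.value   # p-value of the Kruskal-Wallis test
summary(results$anova)    # ANOVA table with F statistic and p-value
```

Returning a list keeps the function free of side effects; call print() on the elements if console output is all you need.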

You can use deparse(substitute(quantvar)) to get the quoted name of the column you are passing to the function, which lets you build a formula using paste(). The caller can then pass bare column names instead of strings, which is a more idiomatic way of operating in R.

Here's a reproducible example:

stat_test <- function(mydata, quantvar, groupvar) {
  A <- as.formula(paste(deparse(substitute(quantvar)), "~", 
                        deparse(substitute(groupvar))))
  print(kruskal.test(A, data = mydata))
  cat("\n--------------------------------------\n\n")
  aov(A, data = mydata)
}

stat_test(ToothGrowth, len, supp)
#> 
#>  Kruskal-Wallis rank sum test
#> 
#> data:  len by supp
#> Kruskal-Wallis chi-squared = 3.4454, df = 1, p-value = 0.06343
#> 
#> 
#> --------------------------------------
#> Call:
#>    aov(formula = A, data = mydata)
#> 
#> Terms:
#>                     supp Residuals
#> Sum of Squares   205.350  3246.859
#> Deg. of Freedom        1        58
#> 
#> Residual standard error: 7.482001
#> Estimated effects may be unbalanced

Created on 2020-03-30 by the reprex package (v0.3.0)
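
To see what deparse(substitute()) does in isolation, here is a small sketch. The helper name arg_name is hypothetical and only serves to show the mechanism.

```r
# Hypothetical helper: return the expression supplied for x as a string.
arg_name <- function(x) deparse(substitute(x))

arg_name(len)     # "len": the bare name is captured without being evaluated
arg_name("len")   # "\"len\"": a quoted string keeps its quotes
```

The second call suggests why a function built this way expects bare column names: passing "len" as a string would put the quotes into the pasted formula text.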

For reference, if using rstatix (a tidy version of the statistical functions), you need to use sym() and !! for functions that take column names, and formula() for functions that take a formula.

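make_kruskal_test <- function(data, quantvar, groupvar) {
  library(rstatix, quietly = TRUE)
  library(rlang, quietly = TRUE)

  # Formula for tests that take one, e.g. kruskal_test(data, formula_expression);
  # it is built here but not used by the Shapiro-Wilk call below.
  formula_expression <- formula(paste(quantvar, "~", groupvar))
  # Turn the column-name string into a symbol so it can be unquoted with !!
  quantvar_sym <- sym(quantvar)

  # Shapiro-Wilk normality test on the quantitative variable
  shapiro <- shapiro_test(data, !!quantvar_sym) %>% print()
}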

sample_data <- tibble::tibble(sample = letters[1:5], mean = 1:5)

make_kruskal_test(sample_data, "mean", "sample")

#> # A tibble: 1 x 3
#>   variable statistic     p
#>   <chr>        <dbl> <dbl>
#> 1 mean         0.987 0.967
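
The function above builds a formula but only runs the Shapiro-Wilk test. As a rough sketch of how both pieces might be combined, the version below passes the formula to rstatix's kruskal_test() and the unquoted symbol to shapiro_test(). It assumes the rstatix and rlang packages are installed; the name run_rstatix_tests is made up for illustration.

```r
library(rstatix)

# Hypothetical helper: run a Shapiro-Wilk and a Kruskal-Wallis test
# from column names given as strings.
run_rstatix_tests <- function(data, quantvar, groupvar) {
  # Formula interface: build len ~ supp from the two strings
  model_formula <- stats::reformulate(groupvar, response = quantvar)
  list(
    # Column-name interface: convert the string to a symbol, unquote with !!
    shapiro = shapiro_test(data, !!rlang::sym(quantvar)),
    kruskal = kruskal_test(data, model_formula)
  )
}

res <- run_rstatix_tests(ToothGrowth, "len", "supp")
res$shapiro
res$kruskal
```

Both elements come back as tibbles, so they can be combined or filtered with the usual tidyverse tools.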
