Pass multiple columns in dataframe into function at once in R

After much searching, I can't seem to figure this out. I'm trying to write a function that:

  • takes a data frame, db
  • groups the data frame by var1
  • returns the mean and sd by group for several different columns

Here is my function:

myfun <- function(db,var1, ...) {

  var1 <- enquo(var1)
  var2 <- quos(...)

  for (i in var2) {

  db %>% 
    group_by(!!var1) %>%       
    summarise(mean_var = mean(!!!var2))

}}

When I call it as follows, nothing is returned:

myfun(data, group, age, bmi)

Ideally, I would like to summarise both age and bmi by group and return the mean and sd for each. In the future, I would like to pass many more columns from data into the function.

The output would be similar to summaryBy from the doBy package, but on many columns at once, and would look like:

Group   age.mean    age.sd
0
1
        bmi.mean    bmi.sd
0
1

Your loop appears to be unnecessary (you aren't doing anything with i). Instead, you could use summarize_at to achieve the effect you want:

library(dplyr)

myfun <- function(db, var1, ...) {

  var1 <- enquo(var1)   # capture the grouping column
  var2 <- quos(...)     # capture the columns to summarise

  db %>%
    group_by(!!var1) %>%
    summarise_at(vars(!!!var2), list(mean = mean, sd = sd))

}

And if we test it out with the diamonds dataset (from ggplot2):

myfun(diamonds, cut, x, z)

  cut       x_mean z_mean  x_sd  z_sd
  <ord>      <dbl>  <dbl> <dbl> <dbl>
1 Fair        6.25   3.98 0.964 0.652
2 Good        5.84   3.64 1.06  0.655
3 Very Good   5.74   3.56 1.10  0.730
4 Premium     5.97   3.65 1.19  0.731
5 Ideal       5.51   3.40 1.06  0.658
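
As an aside on why the original version returned nothing: in R, a for loop itself evaluates to NULL (invisibly), so a function whose last expression is a loop returns NULL regardless of what the loop body computes. A minimal base R sketch of that behaviour follows; the function names loop_only and loop_collect are made up for illustration and are not part of the question or answer.

```r
# A function whose last expression is a for loop returns NULL (invisibly),
# no matter what is computed inside the loop body.
loop_only <- function(x) {
  for (i in x) {
    i * 2  # computed, then discarded
  }
}

# Storing each result and returning the container makes the values visible.
loop_collect <- function(x) {
  out <- numeric(length(x))
  for (k in seq_along(x)) {
    out[k] <- x[k] * 2
  }
  out
}

is.null(loop_only(1:3))  # the loop's results are lost
loop_collect(1:3)        # the doubled values are returned
```

The same thing happens with a dplyr pipeline inside a loop: each iteration's summary is computed and then thrown away unless it is stored or returned.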

To get the formatting closer to what you had in mind in your original post, we can use a bit of tidyr magic:

library(dplyr)
library(tidyr)

myfun <- function(db, var1, ...) {

  var1 <- enquo(var1)
  var2 <- quos(...)

  db %>%
    group_by(!!var1) %>%
    summarise_at(vars(!!!var2), list(mean = mean, sd = sd)) %>%
    # reshape to one row per group and variable
    gather(variable, value, -(!!var1)) %>%
    # split names such as "x_mean"; assumes column names contain no other "_"
    separate(variable, c('variable', 'measure'), sep = '_') %>%
    spread(measure, value) %>%
    arrange(variable, !!var1)

}

   cut       variable  mean    sd
   <ord>     <chr>    <dbl> <dbl>
 1 Fair      x         6.25 0.964
 2 Good      x         5.84 1.06 
 3 Very Good x         5.74 1.10 
 4 Premium   x         5.97 1.19 
 5 Ideal     x         5.51 1.06 
 6 Fair      z         3.98 0.652
 7 Good      z         3.64 0.655
 8 Very Good z         3.56 0.730
 9 Premium   z         3.65 0.731
10 Ideal     z         3.40 0.658
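
For readers on dplyr 1.0.0 or later and tidyr 1.0.0 or later, where summarise_at, gather and spread are superseded, the same idea can be sketched with across() and pivot_longer(). This is an alternative under those version assumptions, not the code from the answer above. The names myfun_wide and myfun_long are hypothetical, and the built-in mtcars data stands in for diamonds so the example does not need ggplot2.

```r
library(dplyr)
library(tidyr)

# Hypothetical helper: one row per group, columns named <column>_<statistic>.
myfun_wide <- function(db, var1, ...) {
  db %>%
    group_by({{ var1 }}) %>%
    summarise(across(c(...), list(mean = mean, sd = sd)), .groups = "drop")
}

# Hypothetical helper: one row per group and variable, with mean and sd columns.
# Assumes the summarised column names contain no underscore of their own,
# because names are split on "_".
myfun_long <- function(db, var1, ...) {
  myfun_wide(db, {{ var1 }}, ...) %>%
    pivot_longer(
      cols = -{{ var1 }},
      names_to = c("variable", ".value"),
      names_sep = "_"
    ) %>%
    arrange(variable, {{ var1 }})
}

myfun_wide(mtcars, cyl, mpg, wt)
myfun_long(mtcars, cyl, mpg, wt)
```

Note that across() orders the output columns by variable and then by statistic (mpg_mean, mpg_sd, wt_mean, wt_sd), which differs from the summarise_at ordering shown earlier.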
