How can I write a function in R which accepts column names like dplyr?

I am writing a package with several functions that accept a data frame as well as the data frame's column names as arguments.

Here is a simplified example:

func = function(df,vars){
    head(df[,vars])
}

#column args as strings
func(mtcars,c("mpg","cyl"))

Instead of supplying the column names as strings, I would like the function to accept (and suggest/auto-complete) the column names like dplyr functions do.

#dplyr-style args
func(mtcars, mpg, cyl)

#which doesn't work because mpg and cyl don't exist as objects

I considered using ... as the function arguments, but this would still involve using strings.

Any help would be appreciated.

A possible solution, using dplyr:

library(dplyr)

func = function(df,...){
  df %>% 
    select(...) %>% 
    head
}


func(mtcars, mpg, cyl)
#>                    mpg cyl
#> Mazda RX4         21.0   6
#> Mazda RX4 Wag     21.0   6
#> Datsun 710        22.8   4
#> Hornet 4 Drive    21.4   6
#> Hornet Sportabout 18.7   8
#> Valiant           18.1   6

func(mtcars, mpg)

#>                    mpg
#> Mazda RX4         21.0
#> Mazda RX4 Wag     21.0
#> Datsun 710        22.8
#> Hornet 4 Drive    21.4
#> Hornet Sportabout 18.7
#> Valiant           18.1
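
If the package function needs a single named argument rather than ..., one option is dplyr's embrace operator. The following is a minimal sketch, assuming dplyr 1.0.0 or later; func_vars and its vars argument are hypothetical names, not part of the answer above.

```r
library(dplyr)

# Hypothetical variant: one named argument instead of `...`.
# `{{ vars }}` forwards the caller's unevaluated expression to select().
func_vars <- function(df, vars) {
  df %>%
    select({{ vars }}) %>%
    head()
}

# Several columns are passed with c(), as in select() itself
func_vars(mtcars, c(mpg, cyl))

# Selection helpers work as well, since select() does the interpretation
func_vars(mtcars, starts_with("d"))
```

This form leaves ... free for other arguments, at the cost of requiring c() around multiple columns.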

Or in base R:

func = function(df,...){
  head(df[, sapply(substitute(...()), deparse)])
}

func(mtcars, mpg, cyl)
#>                    mpg cyl
#> Mazda RX4         21.0   6
#> Mazda RX4 Wag     21.0   6
#> Datsun 710        22.8   4
#> Hornet 4 Drive    21.4   6
#> Hornet Sportabout 18.7   8
#> Valiant           18.1   6

func(mtcars, mpg)

#> [1] 21.0 21.0 22.8 21.4 18.7 18.1
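
Note that the single-column call returns a plain vector, because `[` on a data frame drops the dimension when only one column is selected. If a data frame is wanted in every case, a sketch along these lines may help; func_keep_df is a hypothetical name, and it captures the arguments with substitute(list(...)) rather than the substitute(...()) shortcut used above.

```r
# Hypothetical variant of the base R answer that always returns a data frame
func_keep_df <- function(df, ...) {
  # substitute(list(...)) yields the call list(mpg, cyl, ...);
  # dropping the first element leaves the unevaluated column symbols
  exprs <- as.list(substitute(list(...)))[-1]
  # Turn each symbol into its name as a string
  cols <- vapply(exprs, deparse, character(1))
  # drop = FALSE keeps the result a data frame even for one column
  head(df[, cols, drop = FALSE])
}

func_keep_df(mtcars, mpg, cyl)
func_keep_df(mtcars, mpg)
```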

You can use:

subset(df, select = item)
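
As a concrete illustration with the question's data, the select argument of subset() accepts unquoted column names directly. This is only a usage sketch; the documentation for subset() describes it as a convenience function intended for interactive use, so it may not be the best fit inside package code.

```r
# Unquoted column names are interpreted inside the data frame
res <- subset(mtcars, select = c(mpg, cyl))
head(res)

# A single column still comes back as a data frame
head(subset(mtcars, select = mpg))
```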

You should check out Advanced R by Hadley Wickham, which is extremely interesting, if somewhat, well, advanced. In particular:

20.4 Data masks

In this section, you'll learn about the data mask, a data frame where the evaluated code will look first for variable definitions. The data mask is the key idea that powers base functions like with(), subset() and transform(), and is used throughout the tidyverse in packages like dplyr and ggplot2.
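
The idea can be sketched in base R with substitute() and eval(). This is an illustration only, not the rlang-based approach the book develops; with_mask and threshold are hypothetical names.

```r
# Hypothetical helper: evaluate `expr` with the columns of `data` in scope
with_mask <- function(data, expr) {
  # Capture the caller's expression without evaluating it
  captured <- substitute(expr)
  # Look up names in `data` first, then in the caller's environment
  eval(captured, data, parent.frame())
}

# `mpg` is found in the data frame, not in the calling environment
avg_mpg <- with_mask(mtcars, mean(mpg))

# `cyl` comes from the data, `threshold` from the calling environment
threshold <- 6
n_big <- with_mask(mtcars, sum(cyl > threshold))
```

The fallback to the calling environment is what lets a masked expression mix column names with ordinary variables.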
