Refer to quoted column name in a function in R
I want to use the na_omit function from the collapse package in a user-defined function. na_omit requires a column name to be in quotes as one of its arguments. If I didn't need the column name in quotes, I could just refer to the column name in double braces, {{col}}, as mentioned in the vignette "Programming with dplyr". If I refer to the column using the glue package, such as glue::glue("{col}"), I receive errors.
Here is a reprex:
my_df <-
  data.frame(
    matrix(
      c(
        "V9G", "Blue",
        NA, "Red",
        "J4C", "White",
        NA, "Brown",
        "F7B", "Orange",
        "G3V", "Green"
      ),
      nrow = 6,
      ncol = 2,
      byrow = TRUE,
      dimnames = list(NULL,
                      c("color_code", "color"))
    ),
    stringsAsFactors = FALSE
  )
library(collapse)
library(dplyr)
library(glue)
my_func <- function(df, col){
  df %>%
    collapse::na_omit(cols = c(glue("{col}"))) # Here is the code that fails
}
my_func(my_df, color_code)
The expected output can be generated with the following:
my_df %>%
  collapse::na_omit(cols = c("color_code"))
and should produce:
#   color_code  color
# 1        V9G   Blue
# 2        J4C  White
# 3        F7B Orange
# 4        G3V  Green
How should I refer to a quoted column name that is both a parameter of my user-defined function and an argument to a function called inside it?
In general, collapse mostly uses standard evaluation, and its NSE features are based on base R, so most of the rlang and glue tooling won't work, but you will have simpler and faster code. As suggested, for a single column, a solution would be
function(df, col) { col <- as.character(substitute(col)); ...; }
i.e. use substitute() to capture the expression and as.character or all.vars to extract the variables. For multiple columns, a general solution is wrapping fselect, e.g.
library(collapse)
my_func <- function(df, ...) {
  cols <- fselect(df, ..., return = "indices")
  na_omit(df, cols = cols)
}
my_func(wlddev, PCGDP:GINI, POP) |> head()
#>   country iso3c       date year decade                region
#> 1 Albania   ALB 1997-01-01 1996   1990 Europe & Central Asia
#> 2 Albania   ALB 2003-01-01 2002   2000 Europe & Central Asia
#> 3 Albania   ALB 2006-01-01 2005   2000 Europe & Central Asia
#> 4 Albania   ALB 2009-01-01 2008   2000 Europe & Central Asia
#> 5 Albania   ALB 2013-01-01 2012   2010 Europe & Central Asia
#> 6 Albania   ALB 2015-01-01 2014   2010 Europe & Central Asia
#>                income  OECD    PCGDP LIFEEX GINI       ODA     POP
#> 1 Upper middle income FALSE 1869.866 72.495 27.0 294089996 3168033
#> 2 Upper middle income FALSE 2572.721 74.579 31.7 453309998 3051010
#> 3 Upper middle income FALSE 3062.674 75.228 30.6 354950012 3011487
#> 4 Upper middle income FALSE 3775.581 75.912 30.0 338510010 2947314
#> 5 Upper middle income FALSE 4276.608 77.252 29.0 335769989 2900401
#> 6 Upper middle income FALSE 4413.297 77.813 34.6 260779999 2889104
Created on 2022-02-03 by the reprex package (v2.0.1)
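For the single-column case, the substitute() one-liner can be filled out roughly as follows. This is a minimal sketch, assuming the same toy data as in the question; the function name my_func_single is hypothetical and only used for illustration.

```r
library(collapse)

# Same toy data as in the question, built directly as a data frame
my_df <- data.frame(
  color_code = c("V9G", NA, "J4C", NA, "F7B", "G3V"),
  color      = c("Blue", "Red", "White", "Brown", "Orange", "Green"),
  stringsAsFactors = FALSE
)

# my_func_single is a hypothetical name, not part of collapse
my_func_single <- function(df, col) {
  # substitute() captures the unevaluated argument (the symbol color_code);
  # as.character() turns that symbol into the string "color_code"
  col <- as.character(substitute(col))
  na_omit(df, cols = col)
}

res <- my_func_single(my_df, color_code)
```

Here res should hold only the four rows whose color_code is not missing.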
You have to provide the column name as a character string, like:
my_func <- function(df, col){
  df %>%
    collapse::na_omit(cols = c(glue("{col}"))) # works once col is a string; cols = col would do the same
}
my_func(my_df, col = "color_code")
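Passing names as strings also extends to several columns, since cols accepts a character vector. The following is a sketch under that assumption, using the toy data from the question; my_func_chr and the stopifnot() check are illustrative additions, not part of the original answer.

```r
library(collapse)

# Same toy data as in the question
my_df <- data.frame(
  color_code = c("V9G", NA, "J4C", NA, "F7B", "G3V"),
  color      = c("Blue", "Red", "White", "Brown", "Orange", "Green"),
  stringsAsFactors = FALSE
)

# my_func_chr is a hypothetical name used for illustration
my_func_chr <- function(df, cols) {
  # Fail early if a name is misspelled or is not a string
  stopifnot(is.character(cols), all(cols %in% names(df)))
  na_omit(df, cols = cols)
}

one  <- my_func_chr(my_df, "color_code")
both <- my_func_chr(my_df, c("color_code", "color"))
```

Because no glue or rlang machinery is involved, the function body stays plain standard evaluation.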
It's important to first determine which environment in R you're programming in: dplyr or base R? If in dplyr, refer to the documentation for programming with dplyr, rlang, glue, and this Stack Overflow answer. If in base R, refer to the documentation on non-standard evaluation. This question arose from confusing the two environments. Here are some of the different approaches in a reprex.
Data
my_df <-
  data.frame(
    matrix(
      c(
        "V9G", "Blue",
        NA, "Red",
        "J4C", "White",
        NA, "Brown",
        "F7B", "Orange",
        "G3V", "Green"
      ),
      nrow = 6,
      ncol = 2,
      byrow = TRUE,
      dimnames = list(NULL,
                      c("color_code", "color"))
    ),
    stringsAsFactors = FALSE
  )
Packages
library(collapse)
library(dplyr)
library(stringr)
library(glue)
Functional programming in base R
with a quoted column name:
my_func <- function(df, col) {
  # deparse(substitute()) turns the bare argument into a quoted column name
  col_char_ref <- deparse(substitute(col))
  df %>%
    collapse::na_omit(cols = col_char_ref)
}
my_func(my_df, color_code)
# Should produce the same output as:
my_df %>%
  collapse::na_omit(cols = "color_code")
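To see what substitute() hands back and how deparse(), as.character(), and all.vars() differ, the following base-R-only sketch may help. The helper show_capture is hypothetical and exists only for this illustration.

```r
# show_capture is a hypothetical helper that reports what was captured
show_capture <- function(col) {
  expr <- substitute(col)  # the unevaluated argument
  list(
    class    = class(expr),
    deparsed = deparse(expr),
    as_char  = as.character(expr),
    vars     = all.vars(expr)
  )
}

# A bare column name is captured as a symbol (class "name");
# deparse() and as.character() both give "color_code"
single <- show_capture(color_code)

# A range expression is captured as a call:
# deparse() gives one string, "PCGDP:GINI";
# as.character() splits the call into ":", "PCGDP", "GINI";
# all.vars() keeps only the variable names, "PCGDP" and "GINI"
rng <- show_capture(PCGDP:GINI)
```

For a single bare name the three functions agree, which is why either deparse(substitute(col)) or as.character(substitute(col)) works here; for anything more complex than a name, all.vars() is the safer way to get column names.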
and with a non-quoted column name:
my_func <- function(df, col){
  env <- list2env(df, parent = parent.frame())
  col <- substitute(col)
  df %>%
    collapse::ftransform(count = stringr::str_length(eval(col, env)))
}
my_func(my_df, color)
# Should produce the same output as:
my_df %>%
  collapse::ftransform(count = stringr::str_length(color))
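The list2env() step can often be skipped, because eval() accepts a data frame directly as the place to look up names. The following is a base-R-only sketch of that idea; nchar() stands in for stringr::str_length(), and my_func_eval is a hypothetical name.

```r
# Same toy data as above
my_df <- data.frame(
  color_code = c("V9G", NA, "J4C", NA, "F7B", "G3V"),
  color      = c("Blue", "Red", "White", "Brown", "Orange", "Green"),
  stringsAsFactors = FALSE
)

# my_func_eval is a hypothetical name used for illustration
my_func_eval <- function(df, col) {
  col <- substitute(col)
  # Evaluate the captured expression with df's columns in scope;
  # names not found in df are looked up in the caller's environment
  values <- eval(col, df, parent.frame())
  df$count <- nchar(values)
  df
}

res <- my_func_eval(my_df, color)
```

This is the same pattern base R functions such as subset() rely on, so it needs no extra packages.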
Functional programming in dplyr
with a quoted column name using glue and dplyr functions:
my_func <- function(df, col1, col2) {
  df %>%
    mutate(description := glue("color code: {pull(., {{col1}})}; color: {pull(., {{col2}})}"))
}
my_func(my_df, color_code, color)
# Should produce the same output as:
my_df %>%
  mutate(description = glue("color code: {color_code}; color: {color}"))
or with a quoted column name using sprintf(), a wrapper for the C function of the same name:
my_func <- function(df, col1, col2) {
  df %>%
    mutate(description := sprintf("color code: %s; color: %s", {{col1}}, {{col2}}))
}
my_func(my_df, color_code, color)
# Should produce the same text as:
my_df %>%
  mutate(description = glue("color code: {color_code}; color: {color}"))
and with a non-quoted column name:
my_func <- function(df, col){
  df %>%
    dplyr::mutate(count = stringr::str_length({{ col }}))
}
my_func(my_df, color)
# Should produce the same output as:
my_df %>%
  dplyr::mutate(count = stringr::str_length(color))
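If you prefer to stay with tidyverse-style capture but still need a quoted name for collapse, one possible bridge is rlang's ensym() together with as_name(). This is a sketch that goes beyond the approaches above, not something they use; my_func_bridge is a hypothetical name.

```r
library(collapse)

# Same toy data as above
my_df <- data.frame(
  color_code = c("V9G", NA, "J4C", NA, "F7B", "G3V"),
  color      = c("Blue", "Red", "White", "Brown", "Orange", "Green"),
  stringsAsFactors = FALSE
)

# my_func_bridge is a hypothetical name used for illustration
my_func_bridge <- function(df, col) {
  # ensym() captures the bare column name as a symbol;
  # as_name() converts it to the string that na_omit() expects
  col_name <- rlang::as_name(rlang::ensym(col))
  na_omit(df, cols = col_name)
}

res <- my_func_bridge(my_df, color_code)
```

This keeps the calling convention of the dplyr examples (an unquoted column name) while handing collapse a plain character string.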