I want to use the na_omit function from the collapse package in a user-defined function. na_omit requires a column name to be in quotes as one of its arguments. If I didn't need the column name in quotes, I could just refer to the column name in double braces, {{col}}, as mentioned in this vignette, "Programming with dplyr". If I refer to the column using the glue package, such as glue::glue("{col}"), I receive errors.
Here is a reprex:
my_df <-
  data.frame(
    matrix(
      c(
        "V9G", "Blue",
        NA, "Red",
        "J4C", "White",
        NA, "Brown",
        "F7B", "Orange",
        "G3V", "Green"
      ),
      nrow = 6,
      ncol = 2,
      byrow = TRUE,
      dimnames = list(NULL,
                      c("color_code", "color"))
    ),
    stringsAsFactors = FALSE
  )
library(collapse)
library(dplyr)
library(glue)
my_func <- function(df, col){
  df %>%
    collapse::na_omit(cols = c(glue("{col}"))) #Here is the code that fails
}
my_func(my_df, color_code)
The expected output can be generated with the following:
my_df %>%
  collapse::na_omit(cols = c("color_code"))
and should produce:
#   color_code  color
# 1        V9G   Blue
# 2        J4C  White
# 3        F7B Orange
# 4        G3V  Green
How should I pass a column name that is a parameter of my user-defined function to an inner function that expects it as a quoted string in R?
In general, collapse mostly uses standard evaluation, and its NSE features are built on base R, so most rlang and glue idioms won't work, but you will have simpler and faster code. As suggested, for a single column, a solution would be function(df, col) { col <- as.character(substitute(col)); ...; }, i.e. use substitute() to capture the expression and as.character or all.vars to extract the variables. For multiple columns, a general solution is wrapping fselect, e.g.:
library(collapse)
my_func <- function(df, ...) {
  cols <- fselect(df, ..., return = "indices")
  na_omit(df, cols = cols)
}
my_func(wlddev, PCGDP:GINI, POP) |> head()
#> country iso3c date year decade region
#> 1 Albania ALB 1997-01-01 1996 1990 Europe & Central Asia
#> 2 Albania ALB 2003-01-01 2002 2000 Europe & Central Asia
#> 3 Albania ALB 2006-01-01 2005 2000 Europe & Central Asia
#> 4 Albania ALB 2009-01-01 2008 2000 Europe & Central Asia
#> 5 Albania ALB 2013-01-01 2012 2010 Europe & Central Asia
#> 6 Albania ALB 2015-01-01 2014 2010 Europe & Central Asia
#> income OECD PCGDP LIFEEX GINI ODA POP
#> 1 Upper middle income FALSE 1869.866 72.495 27.0 294089996 3168033
#> 2 Upper middle income FALSE 2572.721 74.579 31.7 453309998 3051010
#> 3 Upper middle income FALSE 3062.674 75.228 30.6 354950012 3011487
#> 4 Upper middle income FALSE 3775.581 75.912 30.0 338510010 2947314
#> 5 Upper middle income FALSE 4276.608 77.252 29.0 335769989 2900401
#> 6 Upper middle income FALSE 4413.297 77.813 34.6 260779999 2889104
Created on 2022-02-03 by the reprex package (v2.0.1)
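The single-column variant mentioned at the start of this answer can be sketched as follows. This is a minimal illustration, not code from the original answer: the function name drop_na_in and the small my_df are hypothetical, and it assumes the collapse package is installed.

```r
library(collapse)

# Small example data (hypothetical, mirrors the question's data frame)
my_df <- data.frame(
  color_code = c("V9G", NA, "J4C", NA, "F7B", "G3V"),
  color = c("Blue", "Red", "White", "Brown", "Orange", "Green"),
  stringsAsFactors = FALSE
)

# drop_na_in is a hypothetical name used only for this sketch
drop_na_in <- function(df, col) {
  # substitute() captures the unevaluated argument (the symbol color_code);
  # as.character() turns that symbol into the string "color_code"
  col_name <- as.character(substitute(col))
  na_omit(df, cols = col_name)
}

res <- drop_na_in(my_df, color_code)
```

Here res should contain only the rows where color_code is not missing. Note that as.character(substitute(col)) is only meaningful when the caller passes a single bare name; for anything more complex, the fselect wrapper above is the safer choice.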
You have to pass the column name as a character string, like this:
my_func <- function(df, col){
  df %>%
    collapse::na_omit(cols = c(glue("{col}"))) # col now holds a string, so glue() can interpolate it
}
my_func(my_df, col = "color_code")
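In the question's call, my_func(my_df, color_code), glue("{col}") forces evaluation of col, so R looks for an object named color_code, which does not exist outside the data frame. If you would like one function to accept either a bare name or a string, a rough sketch could look like the following. The name drop_na_flexible and the example data are hypothetical, and the sketch assumes collapse is installed.

```r
library(collapse)

# Small example data (hypothetical, mirrors the question's data frame)
my_df <- data.frame(
  color_code = c("V9G", NA, "J4C", NA, "F7B", "G3V"),
  color = c("Blue", "Red", "White", "Brown", "Orange", "Green"),
  stringsAsFactors = FALSE
)

# drop_na_flexible is a hypothetical name used only for this sketch
drop_na_flexible <- function(df, col) {
  col_expr <- substitute(col)  # capture the argument without evaluating it
  # A string literal is captured as a character vector, a bare name as a symbol
  col_name <- if (is.character(col_expr)) col_expr else deparse(col_expr)
  stopifnot(col_name %in% names(df))  # fail early on an unknown column
  na_omit(df, cols = col_name)
}

res_bare   <- drop_na_flexible(my_df, color_code)
res_string <- drop_na_flexible(my_df, "color_code")
```

One limitation of this design: a variable that holds the name (x <- "color_code"; drop_na_flexible(my_df, x)) is captured as the symbol x and rejected by the stopifnot() check, so it only suits literal names and literal strings.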
It's important to first determine which evaluation framework you're programming in: dplyr or base R. If in dplyr, refer to the documentation for programming with dplyr, rlang, glue, and this Stack Overflow answer. If in base R, refer to the documentation on non-standard evaluation. This question arose from confusing the two. Here are some of the different approaches in a reprex.
Data
my_df <-
  data.frame(
    matrix(
      c(
        "V9G", "Blue",
        NA, "Red",
        "J4C", "White",
        NA, "Brown",
        "F7B", "Orange",
        "G3V", "Green"
      ),
      nrow = 6,
      ncol = 2,
      byrow = TRUE,
      dimnames = list(NULL,
                      c("color_code", "color"))
    ),
    stringsAsFactors = FALSE
  )
Packages
library(collapse)
library(dplyr)
library(stringr)
library(glue)
Functional Programming in base R
with a quoted column name:
my_func <- function(df, col) {
  col_char_ref <- deparse(substitute(col)) #Use deparse(substitute()) to refer to a quoted column name
  df %>%
    collapse::na_omit(cols = col_char_ref)
}
my_func(my_df, color_code)
#Should generate output below
my_df %>%
  collapse::na_omit(cols = "color_code")
and with a non-quoted column name:
my_func <- function(df, col){
  env <- list2env(df, parent = parent.frame())
  col <- substitute(col)
  df %>%
    collapse::ftransform(count = stringr::str_length(eval(col, env)))
}
my_func(my_df, color)
#Should generate output below
my_df %>%
  collapse::ftransform(count = stringr::str_length(color))
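The capture-and-evaluate pattern above can also be written without building an environment by hand, because eval() accepts a data frame as its second argument. The following is a base-R-only sketch under that assumption; add_count is a hypothetical name, and nchar() stands in for stringr::str_length() to avoid extra dependencies.

```r
# Small example data (hypothetical)
my_df <- data.frame(
  color_code = c("V9G", NA, "J4C"),
  color = c("Blue", "Red", "White"),
  stringsAsFactors = FALSE
)

# add_count is a hypothetical name used only for this sketch
add_count <- function(df, col) {
  col_expr <- substitute(col)  # capture the bare column name as an expression
  # Evaluate the expression with df's columns in scope; names not found
  # in df are looked up in the caller's environment (parent.frame())
  df$count <- nchar(eval(col_expr, df, parent.frame()))
  df
}

res <- add_count(my_df, color)
res$count
#> [1] 4 3 5
```

Because the whole expression is evaluated, a call such as add_count(my_df, paste(color_code, color)) would also work with this approach, which is not true of the deparse(substitute()) variant.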
Functional programming in dplyr
with a quoted column name using glue and dplyr functions:
my_func <- function(df, col1, col2) {
  df %>%
    mutate(description := glue("color code: {pull(., {{col1}})}; color: {pull(., {{col2}})}"))
}
my_func(my_df, color_code, color)
#Should generate output below
my_df %>%
  mutate(description = glue("color code: {color_code}; color: {color}"))
or with a quoted column name using a C language wrapper function:
my_func <- function(df, col1, col2) {
  df %>%
    mutate(description := sprintf("color code: %s; color: %s", {{col1}}, {{col2}}))
}
my_func(my_df, color_code, color)
#Should generate output below
my_df %>%
  mutate(description = glue("color code: {color_code}; color: {color}"))
and with a non-quoted column name:
my_func <- function(df, col){
  df %>%
    dplyr::mutate(count = stringr::str_length({{ col }}))
}
my_func(my_df, color)
#Should generate output below
my_df %>%
  dplyr::mutate(count = stringr::str_length(color))
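For completeness, when the column name arrives as a character string inside a dplyr pipeline, the .data pronoun described in "Programming with dplyr" lets you index the column by name. A minimal sketch, assuming dplyr is installed; keep_non_missing is a hypothetical name, and nchar() is used in place of stringr::str_length() to keep the example small.

```r
library(dplyr)

# Small example data (hypothetical, mirrors the question's data frame)
my_df <- data.frame(
  color_code = c("V9G", NA, "J4C", NA, "F7B", "G3V"),
  color = c("Blue", "Red", "White", "Brown", "Orange", "Green"),
  stringsAsFactors = FALSE
)

# keep_non_missing is a hypothetical name used only for this sketch;
# col is assumed to be a single character string naming a column of df
keep_non_missing <- function(df, col) {
  df %>%
    filter(!is.na(.data[[col]])) %>%        # drop rows where that column is NA
    mutate(count = nchar(.data[[col]]))     # count characters in that column
}

res <- keep_non_missing(my_df, "color_code")
```

This is the dplyr counterpart to passing a quoted name to collapse::na_omit(): the caller supplies a string, and no quoting or unquoting is needed inside the function.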