I have a function, and i want to append to a vector of variables some variables in some specific cases. This later becomes to a regression formuls
Code example:
some_function <- function (df,mdl){
vars <- c("var1", "var2", "var3")
vars <- case_when(mdl== "model1" ~ vars<-("var3", "var4", vars),
mdl== "model2" ~ vars<-("var4", "var5", vars))
target_col <- "count"
target_formula <- as.formula(sprintf("%s ~ %s",
target_col,
paste(vars, collapse = " + ")))
}
mdl is an acronym made of text supposed to represent the different models, there are about 8-10
You can define model1
/ model2
in a list, then map
with as.formula
:
library(purrr)
library(glue)
vars <- c("var1", "var2", "var3")
target <- "count"
models <- list(model1 = c("var3", "var4", vars),
model2 = c("var4", "var5", vars))
map(models, ~as.formula(glue("{target} ~ {paste(., collapse = ' + ')}")))
Output:
$model1
count ~ var3 + var4 + var1 + var2 + var3
<environment: 0x7fec25b8b770>
$model2
count ~ var4 + var5 + var1 + var2 + var3
<environment: 0x7fecf096b560>
We can use reformulate
to create the models in base R
vars <- c("var1", "var2", "var3")
target <- "count"
models <- list(model1 = c("var3", "var4", vars),
model2 = c("var4", "var5", vars))
lapply(models, reformulate, response = target)
-output
#$model1
#count ~ var3 + var4 + var1 + var2 + var3
#<environment: 0x7f92c7658ed8>
#$model2
#count ~ var4 + var5 + var1 + var2 + var3
#<environment: 0x7f92c7649a88>
It can be wrapped in a function
and use conditions with if/else
and the dataset input to function seems to be not used in the OP's post
some_function <- function (df,mdl){
vars <- c("var1", "var2", "var3")
vars <- if(mdl == "model1") {
c(vars, "var4")
} else c(vars, "var5")
target_col <- "count"
reformulate(vars, response = target)
}
-testing
some_function(iris, "model1")
#count ~ var1 + var2 + var3 + var4
#<environment: 0x7f92f0c77370>
some_function(iris, "model2")
#count ~ var1 + var2 + var3 + var5
#<environment: 0x7f92f0ce6db8>
Don't do assignment within case_when
, it is almost never a good thing to attempt. Instead, try this:
some_function <- function (df,mdl){
newvars <- dplyr::case_when(
mdl == "model1" ~ c("var3", "var4"),
mdl == "model2" ~ c("var4", "var5")
)
vars <- c("var1", "var2", "var3", newvars)
# something else here
vars
}
some_function(mtcars, "model1")
# [1] "var1" "var2" "var3" "var3" "var4"
some_function(mtcars, "model2")
# [1] "var1" "var2" "var3" "var4" "var5"
That seems to be okay, but there are two things that could be improved.
The first one repeats "var3"
, perhaps we can add vars <- unique(vars)
to the function.
case_when
is really meant to be a vectorized replacement for nested ifelse
(or dplyr::if_else
), where nesting works but can make things hard to follow/maintain. Because of this, this suggests that mdl
could be length greater than 1. But when we pass a length-2 argument:
some_function(mtcars, c("model1", "model2")) # [1] "var1" "var2" "var3" "var3" "var5"
The first comparison in case_when
finds that mdl == "model1"
matches the first vector, but it only uses the first of the c("var3","var4")
. Further, if we pass many more, then we get errors about incompatible vector lengths.
I suspect that you are intending mdl
to be length 1, in which case you likely have several other model numbers and want a set of additional variables to add to the defaults. Perhaps this is ultimately what you want?
some_function <- function (df, mdl){
newvars <- switch(
mdl[1],
model1 = c("var3", "var4"),
model2 = c("var4", "var5"),
stop("unrecognized model: ", sQuote(mdl))
)
vars <- c("var1", "var2", "var3", newvars)
# something else here
vars
}
some_function(mtcars, "model1")
# [1] "var1" "var2" "var3" "var3" "var4"
some_function(mtcars, "model2")
# [1] "var1" "var2" "var3" "var4" "var5"
some_function(mtcars, "model3")
# Error in some_function(mtcars, "model3") : unrecognized model: 'model3'
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.