User-defined variables in a function in R

I am trying to write a generic function that constructs a formula for linear regression. I want the function to create the formula either

  • using user defined variables or,
  • using all the variables present in the dataframe.

I can create the formula using all the variables present in the dataframe. My problem is with the user-defined variables: I do not know how to get them into the function so that I can use them to build the formula.

The function I have so far is this:

lmformula <- function (data, IndepVariable = character, VariableList = TRUE){
  if (VariableList) {
    newlist <- list()
    newlist <-  # Here is where I do not know exactly what to do to extract the variables defined by the user
    DependVariables <- newlist
    f <- as.formula(paste(IndepVariable, "~", paste((DependVariables), collapse = '+')))
  } else {
    names(data) <- make.names(colnames(data))
    DependVariables <- names(data)[!colnames(data) %in% IndepVariable]
    f <- as.formula(paste(IndepVariable, "~", paste((DependVariables), collapse = '+')))
    return (f)
  }
}

Any hint would be deeply appreciated.

The only thing that changes is how you get the independent variables.

If the user specifies them, use that character vector directly.

Otherwise, take all the variables other than the dependent variable (which you are already doing).

Note: as Roland mentioned, the dependent variable goes on the left of the formula, i.e. dependentVariable ~ independentVariable1 + independentVariable2 + independentVariable3. The names in your function are swapped, so the function below uses DepVariable for the left-hand side and IndepVariable for the right-hand side.
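The shape described in the note can be checked directly. This is a minimal sketch that uses base R's reformulate() rather than paste(); the column names are just the mock names used in this answer, not anything from your data.

```r
# Dependent variable on the left, independent variables joined by "+" on the right
f <- reformulate(termlabels = c("col2", "col3"), response = "col1")

print(f)       # shows the resulting formula
all.vars(f)    # lists the variable names, response first
```

reformulate() builds the same kind of formula object as as.formula(paste(...)), so either approach should work here.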

# creating mock data
data <- data.frame(col1 = numeric(0), col2 = numeric(0), col3 = numeric(0), col4 = numeric(0))

# the function
lmformula <- function (data, DepVariable, IndepVariable, VariableList = TRUE) {
  if (!VariableList) {
    IndepVariable <- names(data)[!names(data) %in% DepVariable]
  }
  f <- as.formula(paste(DepVariable,"~", paste(IndepVariable, collapse = '+')))
  return (f)
}

# working examples
lmformula(data = data, DepVariable = "col1", VariableList = FALSE)
lmformula(data = data, DepVariable = "col1", IndepVariable = c("col2", "col3"), VariableList = TRUE)
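To see the formula used in an actual regression, here is a sketch on the built-in mtcars dataset. The function name lmformula2 and the IndepVariable = NULL default are my own hypothetical variation (they replace the VariableList flag), and the column check is an extra assumption about what you would want, not part of the function above.

```r
# Hypothetical variant: IndepVariable = NULL means "use every other column"
lmformula2 <- function(data, DepVariable, IndepVariable = NULL) {
  if (is.null(IndepVariable)) {
    IndepVariable <- setdiff(names(data), DepVariable)
  }
  # Fail early if a requested column is not in the data frame
  missing_cols <- setdiff(c(DepVariable, IndepVariable), names(data))
  if (length(missing_cols) > 0) {
    stop("Columns not found in data: ", paste(missing_cols, collapse = ", "))
  }
  as.formula(paste(DepVariable, "~", paste(IndepVariable, collapse = " + ")))
}

# User-defined variables
f_user <- lmformula2(mtcars, DepVariable = "mpg", IndepVariable = c("wt", "hp"))
# All remaining columns of the data frame
f_all <- lmformula2(mtcars, DepVariable = "mpg")

# The formula can be passed straight to lm()
fit <- lm(f_user, data = mtcars)
coef(fit)
```

With a NULL default the caller does not have to keep a separate flag in sync with the variable list, but the two-argument version above works just as well.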

Hope it helps!

