I'm trying to write a user-defined function that takes predetermined variables (independent and dependent) from the active data frame. Take the example data frame df below, which records a coin toss outcome alongside other recorded variables:
> df
  outcome toss    person  hand age
1       H    1      Mary  Left  18
2       T    2     Allen  Left  12
3       T    3       Dom  Left  25
4       T    4 Francesca  Left  42
5       H    5      Mary Right  18
6       H    6     Allen Right  12
7       H    7       Dom Right  25
8       T    8 Francesca Right  42
The df data frame has a binomial response, outcome, which is either heads or tails, and I am going to look at how person, hand, and age might affect this categorical outcome. I plan to use a forward-selection approach, which will test one variable at a time against outcome and then progress to adding more.
To keep things simple, I want to be able to identify the response/dependent (e.g., outcome) and predictor/independent (e.g., person, hand) variables before calling my user-defined function, like so:
> independent<-c('person','hand','age')
> dependent<-'outcome'
Then I create my function using the lapply and glm functions:
> test.func<-function(some_data,the_response,the_predictors)
+ {
+ lapply(the_predictors,function(a)
+ {
+ glm(substitute(as.name(the_response)~i,list(i=as.name(a))),data=some_data,family=binomial)
+ })
+ }
Yet, when I attempt to run the function with the predetermined vectors, this occurs:
> test.func(df,dependent,independent)
Error in as.name(the_response) : object 'the_response' not found
My expected result would be the following:
> models<-lapply(independent,function(x)
+ {
+ glm(substitute(outcome~i,list(i=as.name(x))),data=df,family=binomial)
+ })
> models
[[1]]
Call: glm(formula = substitute(outcome ~ i, list(i = as.name(x))),
family = binomial, data = df)
Coefficients:
    (Intercept)        personDom  personFrancesca       personMary
      1.489e-16       -1.799e-16        1.957e+01       -1.957e+01
Degrees of Freedom: 7 Total (i.e. Null); 4 Residual
Null Deviance: 11.09
Residual Deviance: 5.545 AIC: 13.55
[[2]]
Call: glm(formula = substitute(outcome ~ i, list(i = as.name(x))),
family = binomial, data = df)
**End Snippet**
As you can tell, using lapply and glm, I have created 3 simple models without the extra work of fitting each one individually. You may be asking: why create a user-defined function when you have simple code right there? I plan to run a while or repeat loop, and a function will decrease clutter.
Thank you for your assistance
I know code-only answers are deprecated, but I thought you were almost there and could just use a nudge: use the formula function (and include the_response in the substitution):
test.func<-function(some_data,the_response,the_predictors)
{
  lapply(the_predictors,function(a)
  {
    # build the formula (e.g. outcome ~ person), print it, then fit the model
    print( form<- formula(substitute(resp~i,
                   list(resp=as.name(the_response), i=as.name(a)))))
    glm(form, data=some_data,family=binomial)
  })
}
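As for why the original attempt failed: as far as I understand it, substitute() does not evaluate its first argument; it only swaps in the symbols named in the list. So as.name(the_response) stays in the formula as a literal call, and glm later appears to trip over it when it builds the model frame somewhere the_response is not visible. A minimal sketch of the difference (the names bad and good are mine, purely for illustration):

```r
the_response <- "outcome"
a <- "person"

# Only `i` is named in the list, so as.name(the_response) is left untouched
bad <- substitute(as.name(the_response) ~ i, list(i = as.name(a)))
print(bad)

# Naming a placeholder for the response lets substitute() swap in the symbol
good <- substitute(resp ~ i,
                   list(resp = as.name(the_response), i = as.name(a)))
print(good)
```

The second form is what the function above relies on.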
Test:
> test.func(df,dependent,independent)
outcome ~ person
<environment: 0x7f91a1ba5588>
outcome ~ hand
<environment: 0x7f91a2b38098>
outcome ~ age
<environment: 0x7f91a3fad468>
[[1]]
Call: glm(formula = form, family = binomial, data = some_data)
Coefficients:
    (Intercept)        personDom  personFrancesca       personMary
      8.996e-17       -1.540e-16        1.957e+01       -1.957e+01
Degrees of Freedom: 7 Total (i.e. Null); 4 Residual
Null Deviance: 11.09
Residual Deviance: 5.545 AIC: 13.55
[[2]]
Call: glm(formula = form, family = binomial, data = some_data)
#snipped
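If you would rather avoid substitute altogether, base R's reformulate() builds a formula directly from character strings. The sketch below is an alternative, not what was run above: test.func2 is a hypothetical name, and the data frame is retyped from the question with outcome stored as a factor (an assumption on my part about how the column is held).

```r
# Rebuild the example data from the question (outcome stored as a factor)
df <- data.frame(
  outcome = factor(c("H", "T", "T", "T", "H", "H", "H", "T")),
  toss    = 1:8,
  person  = rep(c("Mary", "Allen", "Dom", "Francesca"), times = 2),
  hand    = rep(c("Left", "Right"), each = 4),
  age     = rep(c(18, 12, 25, 42), times = 2)
)

# Hypothetical variant of test.func that uses reformulate()
test.func2 <- function(some_data, the_response, the_predictors) {
  lapply(the_predictors, function(a) {
    # reformulate("person", response = "outcome") gives outcome ~ person
    form <- reformulate(a, response = the_response)
    glm(form, data = some_data, family = binomial)
  })
}

# glm may warn about fitted probabilities of 0 or 1 on a data set this small
models <- test.func2(df, "outcome", c("person", "hand", "age"))
```

As with the substitute version, the Call shown for each model will read formula = form rather than the expanded formula; formula(models[[1]]) recovers the one actually used.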