
R: How to pass a variable name by reference to glm() or lm()?

Let's say I have a named vector:

sorted = c(1,2,3)
names(sorted) = c("A","B","C")

It looks like this:

> sorted
A    B    C
1    2    3

So this is a vector whose elements are named A, B, and C, with values 1, 2, and 3 respectively.

I also have some sample data:

data.ex = as.data.frame(matrix(rep(c(1,2,3,4),3), nrow = 3, ncol = 3))
colnames(data.ex) = c("A","B","C")

So this data frame has 3 columns, also named A, B, and C.

I want to predict C from a single column, chosen by name from sorted, with glm():

fit.ex = glm(formula = C ~ names(sorted)[2],
         data = data.ex,
         family = binomial(link = "logit"))

But I keep getting the following error message:

Error in model.frame.default(formula = C ~ names(sorted)[2], data = data.ex,: 
variable lengths differ (found for 'names(sorted)[2]')

I read this article and found the as.name() function, but it still doesn't work: http://www.ats.ucla.edu/stat/r/pages/looping_strings.htm

I can't find anything else similar to my problem. If there is another thread addressing this, please point me to it. Any kind of help is greatly appreciated! :)

Providing an answer based on the comments:

sorted = c(A = 1, B = 2, C = 3)
data.ex = data.frame(A = 1:4, B = 2:5, C = c(1, 0, 0, 1))

Construct a list of formulas, one per predictor, and fit a model to each:

forms <- lapply(names(sorted)[1:2],reformulate,response="C")
models <- lapply(forms,glm,data = data.ex,
                 family = binomial(link = "logit"))

Then you can do things like

t(sapply(models,coef))
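
One thing to watch with t(sapply(models, coef)): the column labels of the result appear to be taken from the first model's coefficient names, so the slope column is labelled A even in the row that belongs to the B model. A minimal sketch of a more explicit table follows, assuming every model has exactly one predictor; the names predictors and coef_table are made up for illustration and are not part of the original answer.

```r
sorted  <- c(A = 1, B = 2, C = 3)
data.ex <- data.frame(A = 1:4, B = 2:5, C = c(1, 0, 0, 1))

# Predictor names to loop over (everything in sorted except the response C)
predictors <- names(sorted)[1:2]

# Same construction as above: one formula and one fitted model per predictor
forms  <- lapply(predictors, reformulate, response = "C")
models <- lapply(forms, glm, data = data.ex,
                 family = binomial(link = "logit"))

# One row per model; coefficient 1 is the intercept, coefficient 2 the slope
coef_table <- data.frame(
  predictor = predictors,
  intercept = vapply(models, function(m) unname(coef(m)[1]), numeric(1)),
  slope     = vapply(models, function(m) unname(coef(m)[2]), numeric(1))
)
print(coef_table)
```

Keeping the predictor name in its own column avoids relying on how sapply() labels the simplified matrix.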

The plyr package is also handy for this sort of thing (e.g. plyr::ldply(models, coef)).
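
For the single model in the question, the original call fails because names(sorted)[2] inside a formula is evaluated as the length-one character string "B" rather than being replaced by the column B, which is why model.frame reports that the variable lengths differ. A minimal sketch of two ways to build the formula from the string first, using the data from this answer (predictor, f1 and f2 are illustrative names):

```r
sorted  <- c(A = 1, B = 2, C = 3)
data.ex <- data.frame(A = 1:4, B = 2:5, C = c(1, 0, 0, 1))

# names(sorted)[2] is just the string "B", not the column data.ex$B
predictor <- names(sorted)[2]

# Option 1: reformulate() builds the formula C ~ B from the string
f1 <- reformulate(predictor, response = "C")

# Option 2: paste the formula text together and convert it
f2 <- as.formula(paste("C ~", predictor))

# Either formula can now be passed to glm() (or lm()) as usual
fit.ex <- glm(f1, data = data.ex, family = binomial(link = "logit"))
coef(fit.ex)
```

reformulate() is probably the safer choice when there are several predictor names, since it joins them with + for you; the paste() version is easier to read for one-off formulas.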
