Problem:
Building statistical models using formula is a powerful and elegant feature of the R language. One of the reasons I haven't used formula as much as I should is that the syntax is a bit confusing (for example, x*y does not simply mean "the product of x and y").
Question:
I am looking for a method to make sure that I have used the formula syntax correctly and that the formula I entered really implements the statistical model I have in mind. Ideally, I would like to have this confirmation before actually fitting the model.
Example:
Say I want to find the parameters a and b of the model y = a + b*(x1*x2) by linear regression. Naively, I enter this in R:
df <- data.frame(y=seq(5), x1=runif(5), x2=runif(5)) # toy data
lm(y~x1*x2, data=df) # this is wrong
I can tell from the output of lm that this is not what I wanted because of the extra coefficients for x1 and x2. But it should be possible to debug the formula before calling the fitting function. (The correct way to fit this model would be lm(y~x1:x2, data=df).)
One way you can debug a formula before you run the model is by using formula, terms and update, which print the formula in its expanded form:
f <- formula( y ~ x1*x2)
update( f , terms( f ) )
# y ~ x1 + x2 + x1:x2
f <- formula( y ~ x1:x2)
update( f , terms( f ) )
# y ~ x1:x2
Incidentally, you can also specify the intercept term in your model explicitly (i.e. the parameter a) by including a 1 (since 1*a = a), so this is equivalent:
f <- formula( y ~ 1 + x1:x2)
update( f , terms( f ) )
# y ~ x1:x2
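As a complementary check, here is a minimal sketch that inspects the expanded terms and the design-matrix columns without fitting anything. The helper name expanded_terms is hypothetical (it is not part of base R); it only wraps the "term.labels" attribute of terms(). The toy data frame mirrors the one in the question.

```r
# Hypothetical helper: list the terms a formula expands to.
expanded_terms <- function(f) attr(terms(f), "term.labels")

labels_star  <- expanded_terms(y ~ x1 * x2)   # "x1" "x2" "x1:x2"
labels_colon <- expanded_terms(y ~ x1:x2)     # "x1:x2"

# Is an intercept included? (1 = yes, 0 = no)
has_int    <- attr(terms(y ~ x1:x2), "intercept")      # 1
has_no_int <- attr(terms(y ~ 0 + x1:x2), "intercept")  # 0

# The design-matrix columns are the coefficients lm() would estimate.
df <- data.frame(y = seq(5), x1 = runif(5), x2 = runif(5))  # toy data
cols <- colnames(model.matrix(y ~ x1:x2, data = df))   # "(Intercept)" "x1:x2"

print(labels_star)
print(cols)
```

If the column names returned by model.matrix do not match the parameters of the model you have in mind, the formula needs adjusting before you call lm.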