Is my R formula equivalent to the statistical model I have in mind?
Problem:
Building statistical models using formula is a powerful and elegant feature of the R language. One of the reasons I haven't used formula as much as I should is that the syntax is a bit confusing (for example, x*y does not simply mean "the product of x and y").
Question:
I am looking for a method to make sure that I have used the formula syntax correctly and that the formula I entered really implements the statistical model I have in mind. Ideally, I would like to have this confirmation before actually fitting the model.
Example:
Say I want to find the parameters a and b of the model y = a + b*(x1*x2) by linear regression. Naively, I enter this in R:
df <- data.frame(y=seq(5), x1=runif(5), x2=runif(5)) # toy data
lm(y~x1*x2, data=df) # this is wrong
I can tell from the output of lm that this is not what I wanted, because of the extra coefficients for x1 and x2. But it should be possible to debug the formula before calling the fitting function. The correct way to fit this model would be:
lm(y~x1:x2, data=df)
One way you can debug a formula before you run the model is by combining formula, terms and update, which prints the formula in its expanded form:
f <- formula( y ~ x1*x2)
update( f , terms( f ) )
# y ~ x1 + x2 + x1:x2
f <- formula( y ~ x1:x2)
update( f , terms( f ) )
# y ~ x1:x2
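If you would rather check the expansion programmatically than read it off the printed formula, the terms object can also be inspected directly. The following is only a minimal sketch: the variable names f1, f2, labels1, labels2 and has_intercept are made up for illustration, and it relies on the "term.labels" and "intercept" attributes that terms() attaches to its result.

```r
# Inspect what a formula expands to, without fitting anything.
f1 <- y ~ x1 * x2   # main effects plus interaction
f2 <- y ~ x1:x2     # interaction only

# "term.labels" lists the terms on the right-hand side after expansion.
labels1 <- attr(terms(f1), "term.labels")
labels2 <- attr(terms(f2), "term.labels")

# "intercept" is 1 when the model includes an intercept, 0 otherwise.
has_intercept <- attr(terms(f2), "intercept") == 1

print(labels1)        # "x1" "x2" "x1:x2"
print(labels2)        # "x1:x2"
print(has_intercept)  # TRUE
```

No data frame is needed here, because terms() only parses the formula.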
Incidentally, you can also specify the intercept term in your model explicitly (i.e. the parameter a) by including a 1 (since 1*a = a). R includes the intercept by default, which is why the expanded formula does not show the 1, so this is equivalent:
f <- formula( y ~ 1 + x1:x2)
update( f , terms( f ) )
# y ~ x1:x2
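Another check that stops short of fitting is to look at the columns of the design matrix, since each column corresponds to one coefficient that lm would estimate. This is a sketch using model.matrix, which is a choice made here and not part of the method above; the seed and the names cols_star, cols_colon, mm and product_ok are arbitrary and purely illustrative.

```r
# Toy data as in the question; the seed is arbitrary and only makes it reproducible.
set.seed(1)
df <- data.frame(y = seq(5), x1 = runif(5), x2 = runif(5))

# Column names of the design matrix = coefficients lm() would estimate.
cols_star  <- colnames(model.matrix(y ~ x1 * x2, data = df))
cols_colon <- colnames(model.matrix(y ~ x1:x2, data = df))

print(cols_star)   # "(Intercept)" "x1" "x2" "x1:x2"
print(cols_colon)  # "(Intercept)" "x1:x2"

# For numeric predictors, the x1:x2 column is the elementwise product of x1 and x2.
mm <- model.matrix(y ~ x1:x2, data = df)
product_ok <- isTRUE(all.equal(unname(mm[, "x1:x2"]), df$x1 * df$x2))
print(product_ok)  # TRUE
```

For the model y = a + b*(x1*x2), the two columns "(Intercept)" and "x1:x2" line up with the parameters a and b.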