R: modeling on residuals
I have heard people talk about "modeling on the residuals" when they want to estimate some effect after an a-priori model has been fitted. For example, if they know that two variables, var_1 and var_2, are correlated, they first fit a model with var_1 and then model the effect of var_2 afterwards. My problem is that I've never seen this done in practice.
I'm interested in the following:
1. If I use a glm, how do I account for the link function used?
2. What distribution do I choose when I run a second glm with var_2 as explanatory variable? I assume this is related to 1.

My attempt:
library(data.table)

# I have a hypothesis that `mpg` is a function of both `cyl` and `wt`
dt <- data.table(mtcars)
dt[, cyl := as.factor(cyl)]
model <- stats::glm(mpg ~ cyl, family=Gamma(link="log"), data=dt) # I want to model `cyl` first
dt[, pred := stats::predict(model, type="response", newdata=dt)]
dt[, res := mpg - pred]
# will this approach work?
model2_1 <- stats::glm(mpg ~ wt + offset(pred), family=Gamma(link="log"), data=dt)
dt[, pred21 := stats::predict(model2_1, type="response", newdata=dt) ]
# or will this approach work?
model2_2 <- stats::glm(res ~ wt, family=gaussian(), data=dt)
dt[, pred22 := stats::predict(model2_2, type="response", newdata=dt) ]
My first suggested approach has convergence issues, but this is how my silly brain would approach this problem. Thanks for any help!
In a sense, an ANCOVA is 'modeling on the residuals'. The model for ANCOVA is y_i = grand_mean + treatment_i + b * (covariate - covariate_mean_i) + error for each treatment i. The term (covariate - covariate_mean_i) can be seen as the residuals of a model with covariate as the dependent variable (DV) and treatment as the independent variable (IV).
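To make the residual interpretation concrete, here is a minimal sketch using mtcars, assuming cyl plays the role of the treatment and wt the covariate (that mapping, and the variable names cyl_f, res_wt and centered_wt, are illustrative choices). It checks that the within-group centered covariate matches the residuals of a model with the covariate as DV and the treatment as IV.

```r
# Treat cyl as the treatment factor (illustrative assumption)
cyl_f <- factor(mtcars$cyl)

# Residuals of the covariate regressed on the treatment
res_wt <- unname(resid(lm(wt ~ cyl_f, data = mtcars)))

# Covariate minus its per-treatment mean
centered_wt <- mtcars$wt - ave(mtcars$wt, cyl_f)

# The two should agree up to floating-point tolerance
all.equal(res_wt, centered_wt)
```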
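The following regression corresponds to this ANCOVA (note that scale(covariate, scale = FALSE) centers the covariate on its overall mean, and * additionally lets the slope differ by treatment):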
lm(y ~ treatment * scale(covariate, scale = FALSE))
Applied to the data, it would look like this:
lm(mpg ~ factor(cyl) * scale(wt, scale = FALSE), data = mtcars)
It can be turned into a glm similar to the one you use in your example:
glm(mpg ~ factor(cyl) * scale(wt, scale = FALSE),
    family = Gamma(link = "log"),
    data = mtcars)
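On the link-function question, one possible two-stage sketch is shown here, offered as an assumption-laden illustration rather than a recommended procedure. In glm, an offset is added to the linear predictor, so with a log link the first-stage fit has to enter on the log scale (predict with type = "link"), not on the response scale. The names m1, m2, d and eta1 are hypothetical.

```r
# Stage 1: model mpg on cyl alone
m1 <- glm(mpg ~ factor(cyl), family = Gamma(link = "log"), data = mtcars)

# Keep the stage-1 linear predictor, i.e. the fit on the log scale
d <- mtcars
d$eta1 <- predict(m1, type = "link")

# Stage 2: estimate the wt effect with the stage-1 fit held fixed as an offset
m2 <- glm(mpg ~ wt + offset(eta1), family = Gamma(link = "log"), data = d)

coef(m2)
```

Because the stage-1 coefficients are frozen, the wt coefficient from m2 will generally differ from the one obtained by fitting cyl and wt jointly, which is one reason the single-model formulation above is usually easier to interpret.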