Weighted linear regression in R with lm() and svyglm(). Same model, different results
I want to run a linear regression with survey weights in RStudio. I have seen that this is possible with the lm() function, which lets me specify the weights I want to use. However, it is also possible with the svyglm() function, which runs the regression on a survey design object that has been weighted by the desired variable.
In theory, I see no reason for the results of these two regression models to differ, and indeed the beta estimates are the same. However, the standard errors in each model are different, leading to different p-values and therefore to different levels of significance.

Which model is the most appropriate one? Any help would be greatly appreciated.
Here is the R code:
library(survey)  # needed for svydesign() and svyglm()

dat <- read.csv("https://raw.githubusercontent.com/LucasTremlett/questions/master/questiondata.csv")

model.weighted1 <- lm(DV ~ IV1 + IV2 + IV3, data = dat, weights = weight)
summary(model.weighted1)

dat.weighted <- svydesign(ids = ~1, data = dat, weights = ~weight)
model.weighted2 <- svyglm(DV ~ IV1 + IV2 + IV3, design = dat.weighted)
summary(model.weighted2)
Mostly to confirm what is in the comments already: lm and svyglm will always give the same point estimates, but will typically give different standard errors. In the terminology I use here, and which @BenBolker already links (thanks!), lm assumes precision weights and svyglm assumes sampling weights. Since yours are survey weights, they are sampling weights, so svyglm is the appropriate model. Any further information about the sampling design would go in svydesign and would be used to reduce the standard errors in svyglm.
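A minimal sketch illustrating the point above, using the data URL from the question. It assumes the survey and sandwich packages are installed: the point estimates from lm and svyglm coincide, while svyglm's standard errors are design-based (sandwich-type) and should be close to a robust covariance computed on the lm fit, up to a small finite-sample correction, whereas lm's default standard errors are model-based.

```r
library(survey)    # svydesign(), svyglm()
library(sandwich)  # vcovHC() for robust (sandwich) covariance

dat <- read.csv("https://raw.githubusercontent.com/LucasTremlett/questions/master/questiondata.csv")

fit.lm  <- lm(DV ~ IV1 + IV2 + IV3, data = dat, weights = weight)
des     <- svydesign(ids = ~1, data = dat, weights = ~weight)
fit.svy <- svyglm(DV ~ IV1 + IV2 + IV3, design = des)

# Identical point estimates
cbind(lm = coef(fit.lm), svyglm = coef(fit.svy))

# Different standard errors: lm's are model-based (precision weights),
# svyglm's are design-based; the HC0 sandwich SEs on the lm fit should
# be close to svyglm's, up to a finite-sample factor
cbind(lm       = sqrt(diag(vcov(fit.lm))),
      sandwich = sqrt(diag(vcovHC(fit.lm, type = "HC0"))),
      svyglm   = sqrt(diag(vcov(fit.svy))))
```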