R regression on multiple samples
I am using R.
I have a panel dataset of ~5000 observations of 250 individuals over time.
I need to build a difference-in-differences regression, so I draw one random observation for each individual and run a regression:
lm(x ~ x1 + x2 + ... , data = ddply(df,.(individual),function(x) x[sample(nrow(x),1),]))
over the resulting sample.
I need to compute the regression n times on n different random samples and compute the average of each estimator.
Is there a way to do this efficiently without manually computing and averaging n regressions?
Solved:
I expected to find a specific package for this, but I built a function instead. For example, for n = 700:
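library(plyr)  # provides ddply() and .()
# One replication: draw one random row per individual, fit the model, return the coefficients
fun <- function() {
  alfa <- ddply(df, .(individual), function(x) x[sample(nrow(x), 1), ])
  beta <- lm(x ~ x1 + x2 + ... , data = alfa)$coefficients
  return(beta)
}
df.full <- replicate(700, fun())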
This creates a matrix with 700 columns, one per replication, with the coefficient names as row names. I can even do something like this:
# Same replication, but take the estimates from the summary's coefficient table
fun <- function() {
  alfa <- ddply(df, .(individual), function(x) x[sample(nrow(x), 1), ])
  beta <- lm(x ~ x1 + x2 + ... , data = alfa)
  gamma <- summary(beta)[["coefficients"]][, 1]  # column 1 holds the estimates
  return(gamma)
}
df.full <- replicate(700, fun())
Replacing [,1] with [,2] gives the standard errors instead. After this, computing the means follows directly.
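The averaging step can be sketched end to end as follows. This is only a minimal illustration under stated assumptions: the panel is simulated, the names y, x1, x2, one_draw, est and avg are hypothetical, and the per-individual draw uses base R split() and vapply() rather than ddply() so that the example runs without extra packages.

```r
# Simulated panel (hypothetical data): 250 individuals, 20 observations each
set.seed(1)
df <- data.frame(
  individual = rep(1:250, each = 20),
  x1 = rnorm(5000),
  x2 = rnorm(5000)
)
df$y <- 1 + 2 * df$x1 - 0.5 * df$x2 + rnorm(5000)

# One replication: pick one random row index per individual,
# fit the model on those rows, and return the coefficient estimates
one_draw <- function() {
  rows <- vapply(
    split(seq_len(nrow(df)), df$individual),
    function(i) i[sample.int(length(i), 1)],
    integer(1)
  )
  fit <- lm(y ~ x1 + x2, data = df[rows, ])
  coef(fit)
}

n <- 100
est <- replicate(n, one_draw())  # matrix: one row per coefficient, one column per draw
avg <- rowMeans(est)             # average of each estimator across the n draws
print(avg)
```

Because replicate() puts the replications in columns, rowMeans() returns one average per coefficient. The same call applied to a matrix of standard errors would average those in the same way.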