
R block resampling by unique identifier for bootstrap

I am attempting to block bootstrap a dataset using R. I have a data frame of firms in counties. I want to sample counties with replacement, then build a dataset with all firms in that sample of counties (with replacement). I run a regression on the new dataset. Then I sample again.

I have a for loop that works like so:

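a <- NULL                                  # coefficient rows accumulate here
counties <- unique(data$county_id)
for (j in 1:10000) {
  # Draw counties with replacement, once per bootstrap replicate
  drawn <- sample(counties, replace = TRUE)
  # Stack all firms of each drawn county; a county drawn twice appears twice
  y <- do.call(rbind, lapply(drawn, function(cid) data[data$county_id == cid, ]))
  a <- rbind(a, lm(profit ~ employees, data = y)$coefficients)
}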

Unfortunately, this sort of for loop in R is extremely slow and computationally expensive. Is it possible to implement this using a more efficient apply function?

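Something like this could help. Note that it resamples individual rows rather than whole counties, and it assumes yvar and xvar are columns of df:

positions <- replicate(1000, sample(seq_len(nrow(df)), nrow(df), replace = TRUE))

apply(positions, 2, function(i) lm(yvar[i] ~ xvar[i], data = df)$coefficients)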
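
To resample whole counties instead of single rows, the same idea can be applied to row indices grouped by county. The following is only a sketch under stated assumptions: the toy data frame is made up for illustration, and the names rows_by_county, boot_once and boot_coefs are hypothetical, not part of the original question or answer. It avoids the repeated rbind on data frames by building one index vector per replicate.

```r
# Hypothetical toy data standing in for the question's data frame
set.seed(1)
data <- data.frame(
  county_id = rep(1:20, each = 5),
  firm_id   = 1:100,
  employees = rpois(100, 50)
)
data$profit <- 2 * data$employees + rnorm(100, sd = 10)

# Row indices grouped by county, computed once outside the resampling step
rows_by_county <- split(seq_len(nrow(data)), data$county_id)
n_counties <- length(rows_by_county)

# One bootstrap replicate: draw counties with replacement, keep all their firms
boot_once <- function() {
  drawn <- sample(n_counties, n_counties, replace = TRUE)
  idx <- unlist(rows_by_county[drawn], use.names = FALSE)
  coef(lm(profit ~ employees, data = data[idx, ]))
}

n_reps <- 200  # the question uses 10000; kept small here
# replicate() returns one column per replicate, so transpose to one row each
boot_coefs <- t(replicate(n_reps, boot_once()))
```

Each row of boot_coefs holds the intercept and slope from one replicate, so something like apply(boot_coefs, 2, sd) would give block-bootstrap standard errors. Most of the remaining run time is likely to be in lm itself.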

