Looping regressions in R

What is the best way to re-write this code into a loop?

a.data1 <- read.csv('outdata1.csv')
growth.sub.QOG1 <- merge(QOG, a.data1, by = c('year', 'country'), all = F)
growth.re1 <- plm(NY.GDP.PCAP.KD.ZG ~ log(Enrolment.in.all.programmes..Tertiary..Total) + law + 
                  engineering + log(SP.POP.TOTL) + lp.legor
                    ,data=growth.sub.QOG1, model="random")
summary(growth.re1)
eststo(growth.re1)


a.data2 <- read.csv('outdata2.csv')
growth.sub.QOG2 <- merge(QOG, a.data2, by = c('year', 'country'), all = F)
growth.re2 <- plm(NY.GDP.PCAP.KD.ZG ~ log(Enrolment.in.all.programmes..Tertiary..Total) + law + 
                  engineering + log(SP.POP.TOTL) + lp.legor
                    ,data=growth.sub.QOG2, model="random")
summary(growth.re2)
eststo(growth.re2)

a.data3 <- read.csv('outdata3.csv')
growth.sub.QOG3 <- merge(QOG, a.data3, by = c('year', 'country'), all = F)
growth.re3 <- plm(NY.GDP.PCAP.KD.ZG ~ log(Enrolment.in.all.programmes..Tertiary..Total) + law + 
                  engineering + log(SP.POP.TOTL) + lp.legor
                    ,data=growth.sub.QOG3, model="random")
summary(growth.re3)
eststo(growth.re3)

I tried to do something like this:

for (i  in 1:10) {
a.data[i] <- read.csv('outdata[i].csv')
growth.sub.QOG[i] <- merge(QOG, a.data[i], by = c('year', 'country'), all = F)
growth.re[i] <- plm(NY.GDP.PCAP.KD.ZG ~ log(Enrolment.in.all.programmes..Tertiary..Total) + law + 
                  engineering + log(SP.POP.TOTL) + lp.legor
                    ,data=growth.sub.QOG[i], model="random")
summary(growth.re[i])
eststo(growth.re[i])
}

But it didn't work. What am I doing wrong?

Some sample data would have been nice, but the first error I see is that you won't be able to read in the files like that: R does not substitute i inside the quoted string 'outdata[i].csv'. Try:

  file.name <- paste('outdata', i, '.csv', sep='')
  variable <- paste('a.data', i, sep='')
  data.in <- read.csv(file.name)

If you want to store it in a dynamically created variable, use assign:

  assign(variable, data.in)

This should fix the first part!
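
As a variation on the same idea, the files can be stored in a list instead of in dynamically named variables, which tends to be easier to loop over afterwards. The sketch below is only an illustration: it writes three small made-up CSV files to a temporary directory (the columns year, country and x are invented, not taken from the question) and then reads them back in a loop.

```r
# Create three tiny example files in a temporary directory (made-up data).
tmp_dir <- tempfile("outdata_demo")
dir.create(tmp_dir)
for (i in 1:3) {
  demo <- data.frame(year = 2000:2002, country = "A", x = i * (1:3))
  write.csv(demo, file.path(tmp_dir, paste0("outdata", i, ".csv")),
            row.names = FALSE)
}

# 'outdata[i].csv' is a literal string; R does not substitute i inside quotes.
# Build each file name with paste0() instead.
a.data <- vector("list", 3)
for (i in 1:3) {
  file.name <- file.path(tmp_dir, paste0("outdata", i, ".csv"))
  a.data[[i]] <- read.csv(file.name)  # [[i]] stores a whole data frame in slot i
}

str(a.data[[2]])
```

Note the double brackets: a.data[i] refers to a sub-list, while a.data[[i]] refers to the element itself, which is what is needed when storing a data frame or a model object.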

I think this works:

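# example data directory; replace with your own
datadir <- "D:/Regression"
# set the working directory so R knows where to find the data files
setwd(datadir)

# escape the dot so that only names ending in ".csv" match
csvfiles <- list.files(datadir, pattern = "\\.csv$")

# read each file into a variable named after the file (e.g. outdata1)
for (x in csvfiles)
{
  # read.csv expects comma-separated files; use read.csv2 if yours use ";"
  assign(gsub(" ", "", sub("\\.csv$", "", x)), read.csv(x, header = TRUE, stringsAsFactors = FALSE))
}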

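# names of the data frames created above, one string per data frame
data <- c("outdata1", "outdata2", "outdata3")  # ... and so on

# lists to hold the results; whole objects must be stored with [[i]], not [i]
growth.sub.QOG <- list()
growth.re <- list()
Summary <- list()
Est <- list()

i <- 1
for (x in data)
{
  tmp <- get(x)  # look up the data frame by its name
  growth.sub.QOG[[i]] <- merge(QOG, tmp, by = c('year', 'country'), all = FALSE)
  # fit on the merged data, not on tmp
  growth.re[[i]] <- plm(NY.GDP.PCAP.KD.ZG ~ log(Enrolment.in.all.programmes..Tertiary..Total)
                    + law + engineering + log(SP.POP.TOTL) + lp.legor,
                    data = growth.sub.QOG[[i]], model = "random")
  Summary[[i]] <- summary(growth.re[[i]])
  Est[[i]] <- eststo(growth.re[[i]])
  rm(tmp)
  i <- i + 1
}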

Good luck, and let me know if you run into any errors.

Construct your file names.

files <- paste("outdata", 1:3, ".csv", sep = "")
#alternatively, use list.files/dir as suggested by Chris

How you structure the rest of your code depends upon whether or not you care about those intermediate variables. I've assumed that you do, so you have lots of separate loops. If you don't care, merge the lapply statements.

Read in the data.

all_data <- lapply(files, read.csv)

Merge.

merged <- lapply(all_data, function(data) 
{
  merge(QOG, data, by = c('year', 'country'), all = FALSE)
})

Model.

models <- lapply(merged, function(data)
{
  plm(
    NY.GDP.PCAP.KD.ZG ~ log(Enrolment.in.all.programmes..Tertiary..Total) + law + engineering + log(SP.POP.TOTL) + lp.legor,
    data, 
    model = "random"
  )
})

Display some output.

(summaries <- lapply(models, summary))
(eststos <- lapply(models, eststo))
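
If the intermediate variables are not needed, the read, merge and model steps can be folded into one function and applied once per file. The following is a minimal sketch under stated assumptions, not the asker's actual setup: QOG_demo, the columns y and x, and the helper fit_one are all made up for illustration, and lm() stands in for plm() so that the example runs with base R only.

```r
# Made-up stand-in for QOG: one row per year/country with an outcome y.
set.seed(1)
QOG_demo <- expand.grid(year = 2000:2009, country = c("A", "B", "C"),
                        stringsAsFactors = FALSE)
QOG_demo$y <- rnorm(nrow(QOG_demo))

# Write three made-up input files to a temporary directory.
tmp_dir <- tempfile("outdata_demo")
dir.create(tmp_dir)
files <- file.path(tmp_dir, paste0("outdata", 1:3, ".csv"))
for (f in files) {
  d <- QOG_demo[c("year", "country")]
  d$x <- rnorm(nrow(d))
  write.csv(d, f, row.names = FALSE)
}

# One pass per file: read, merge, fit. lm() is a stand-in for plm() here.
fit_one <- function(f) {
  merged <- merge(QOG_demo, read.csv(f, stringsAsFactors = FALSE),
                  by = c("year", "country"), all = FALSE)
  lm(y ~ x, data = merged)
}
models <- lapply(files, fit_one)
names(models) <- basename(files)  # label each model with its source file

# Collect the coefficients into one matrix, one row per file.
coefs <- t(sapply(models, coef))
coefs
```

Naming the list elements after their source files makes it easier to tell the models apart later, for example with models[["outdata2.csv"]].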
