
How to speed up this R function with Rcpp?

I want to write out a lot of text files in a loop in R, but I do not know how to speed it up with Rcpp. The test data and the R function are as follows:

mywrite<- function(data,dataid){
  for(i in unique(dataid$id)) {
    yearid=data[["year"]][i==data[["id"]]]
    for(yr in yearid) {
      fname=paste(i,sprintf("%03d",yr%%1000),sep=".")
      write.table(dataid[i,],file=fname,row.names=FALSE,col.names=FALSE)
      write.table(subset(data,year==yr&id==i),file=fname,row.names=FALSE,col.names=FALSE,append=TRUE)
    }
  }
}

data=data.frame(id=rep(1:5,4),year=rep(1991:2000,2),x=rep(1,40),y=rep(1,40))
dataid=data.frame(id=1:5,lat=31:35,lon=101:105)
mywrite(data,dataid)

PS: Writing out about 30,000 such text files takes about 50 minutes in R, but only 10 minutes in FORTRAN.

I got a 4x speedup by eliminating all the redundant subsetting in the loop and using a split-apply strategy instead:

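mywrite2 <- function(data, dataid) {
  # Split once by (year, id); each chunk x holds the rows for one output file
  by(data, interaction(data$year, data$id), function(x) {
    i <- x$id[1]
    yr <- x$year[1]
    # The ".2" suffix keeps these files apart from those written by mywrite()
    fname <- paste(i, sprintf("%03d.2", yr %% 1000), sep = ".")
    write.table(dataid[i, ], file = fname, row.names = FALSE, col.names = FALSE)
    write.table(x, file = fname, row.names = FALSE, col.names = FALSE, append = TRUE)
  })
}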


> require(microbenchmark)
> microbenchmark(
+ mywrite(data,dataid)
+ ,
+ mywrite2(data,dataid)
+ )
Unit: milliseconds
                   expr       min         lq        mean     median         uq        max neval
  mywrite(data, dataid) 76.613679 77.4709100 78.86304895 78.0260815 78.7791595 128.463443   100
 mywrite2(data, dataid) 18.894828 19.1707455 20.12820819 19.4053135 21.2940880  23.101325   100

This will probably get it in the ballpark of your Fortran code, no Rcpp required: if the 4x speedup carries over to the full data set, 50 minutes becomes roughly 12 to 13 minutes.
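As a further sketch of the same split-apply idea (not benchmarked, so treat any extra gain as unverified), the function below splits the data once with split() and opens each output file a single time, writing the header row and the data rows through one connection instead of reopening the file with append=TRUE. The names mywrite3 and outdir are hypothetical and not part of the original code. It also assumes that looking up the dataid row by id value, rather than by row position as dataid[i,] does, is what is wanted.

```r
# Hypothetical variant: split once, open each output file once.
mywrite3 <- function(data, dataid, outdir = ".") {
  # One chunk per (id, year) pair that actually occurs in the data;
  # drop = TRUE discards combinations with no rows.
  groups <- split(data, interaction(data$id, data$year, drop = TRUE))
  for (g in groups) {
    i  <- g$id[1]
    yr <- g$year[1]
    fname <- file.path(outdir, paste(i, sprintf("%03d", yr %% 1000), sep = "."))
    con <- file(fname, open = "w")
    # Header row, matched by id value (assumption: ids are unique in dataid)
    write.table(dataid[dataid$id == i, ], file = con,
                row.names = FALSE, col.names = FALSE)
    # Data rows go to the same open connection, so no append = TRUE is needed
    write.table(g, file = con, row.names = FALSE, col.names = FALSE)
    close(con)
  }
  invisible(length(groups))  # number of files written
}

# Usage with the test data from the question, writing into a temporary directory
data <- data.frame(id = rep(1:5, 4), year = rep(1991:2000, 2),
                   x = rep(1, 40), y = rep(1, 40))
dataid <- data.frame(id = 1:5, lat = 31:35, lon = 101:105)
out <- tempfile("mywrite3_")
dir.create(out)
n_files <- mywrite3(data, dataid, out)
```

Whether this beats mywrite2() on the real data would need to be checked with microbenchmark in the same way as above.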


 