I want to write out a lot of text files with a loop in R, but I do not know how to speed it up with Rcpp. The test data and R function are as follows:
mywrite <- function(data, dataid) {
  for (i in unique(dataid$id)) {
    yearid <- data[["year"]][i == data[["id"]]]
    for (yr in yearid) {
      fname <- paste(i, sprintf("%03d", yr %% 1000), sep = ".")
      write.table(dataid[i, ], file = fname, row.names = FALSE, col.names = FALSE)
      write.table(subset(data, year == yr & id == i), file = fname,
                  row.names = FALSE, col.names = FALSE, append = TRUE)
    }
  }
}
data <- data.frame(id = rep(1:5, 4), year = rep(1991:2000, 2), x = rep(1, 40), y = rep(1, 40))
dataid <- data.frame(id = 1:5, lat = 31:35, lon = 101:105)
mywrite(data, dataid)
PS: writing out about 30000 such text files takes roughly 50 minutes in R, but only 10 minutes in Fortran.
I got a 4x speedup by eliminating all the redundant subsetting in the loop and using a split-apply strategy instead:
mywrite2 <- function(data, dataid) {
  by(data, interaction(data$year, data$id), function(x) {
    i <- x$id[1]
    yr <- x$year[1]
    # ".2" suffix so the benchmark below doesn't overwrite mywrite's output
    fname <- paste(i, sprintf("%03d.2", yr %% 1000), sep = ".")
    write.table(dataid[i, ], file = fname, row.names = FALSE, col.names = FALSE)
    write.table(x, file = fname, row.names = FALSE, col.names = FALSE, append = TRUE)
  })
}
> require(microbenchmark)
> microbenchmark(
+   mywrite(data, dataid),
+   mywrite2(data, dataid)
+ )
Unit: milliseconds
                  expr       min         lq        mean     median         uq        max neval
 mywrite(data, dataid) 76.613679 77.4709100 78.86304895 78.0260815 78.7791595 128.463443   100
mywrite2(data, dataid) 18.894828 19.1707455 20.12820819 19.4053135 21.2940880  23.101325   100
This will probably get it into the ballpark of your Fortran code, no Rcpp required.
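The speedup comes from grouping the data once instead of re-subsetting it inside nested loops. As a quick sketch using the question's own test data, interaction(data$year, data$id) builds one factor level per (year, id) pair, and splitting on that factor yields exactly the groups that end up as files:

```r
# Sketch: the grouping behind the split-apply approach, on the question's data.
data <- data.frame(id = rep(1:5, 4), year = rep(1991:2000, 2),
                   x = rep(1, 40), y = rep(1, 40))

# One factor level per (year, id) combination; drop = TRUE discards
# combinations that never occur in the data.
groups <- split(data, interaction(data$year, data$id), drop = TRUE)

length(groups)      # 10 distinct (year, id) pairs, i.e. 10 output files
nrow(groups[[1]])   # 4 rows per group, written together in one call
```

Each group is then written in a single pass, rather than scanning the whole data frame with subset() once per file.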