
Different results from the same function when run in parallel in R

I have a problem invoking an external executable with the system2 function together with the parallel library under Windows 10. My function runs an external program that reads a binary file (the file must be in the same folder, as in the example). The problem is that myfunction returns the correct number of rows (35) when it is called directly, but returns 4 rows fewer (31) when it is run in parallel. An example with all the files is available here ( https://www.dropbox.com/sh/kdoqdv5uh1rhr98/AAB86TpgVjVlFQRsTOvmZoipa?dl=0 ). The code that calls my function is as follows:

run function

file_to_read <- "crop@seasonal$d.UED"
library(parallel)
cl <- makeCluster(2)
clusterEvalQ(cl, library(base))
# all_cells is not defined in this excerpt, and the index is not used below
seq_along_path_index <- seq_along(all_cells$V1)
# parallel call
list_results <- parLapply(cl, file_to_read,
                          myfunction) # returns 31
stopCluster(cl)
# simple call
list_results2 <- myfunction(file_to_read) # returns 35

My function is defined as:

myfunction <- function(file_to_read) {
  setwd('G:/Dropbox/Public/Example')
  command <- "./UED_collate.exe"
  arg1 <- './crop@seasonal$d.TDF'
  # run the external program and capture its standard output, one element per line
  crop <- base::system2(command,
                        args = c(arg1, file_to_read,
                                 "--captions:", "state", "site", "cycle", "crop"),
                        stdout = TRUE, wait = TRUE)
  n_row <- length(crop)
  return(n_row)
}

Thanks

It may come from the fact that a data frame is also a list. For example, when you use unlist(list(iris, iris)) , you get a numeric vector of size 1500. Instead, try to use unlist with recursive = FALSE if you want a list with all data frame columns, or use do.call("rbind", list(iris, iris)) instead of unlist(list(iris, iris)) if you want the data frames to be appended by rows.
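As a rough illustration of the three calls mentioned above, here is a small sketch using the built-in iris dataset (150 rows, 5 columns). The variable names (dfs, flat, cols, stacked) are made up for this example and do not come from the question.

```r
# Illustrative sketch; the variable names below are hypothetical.
dfs <- list(iris, iris)

# unlist() flattens every column of every data frame into one vector
# (the factor column is reduced to its integer codes).
flat <- unlist(dfs)
length(flat)    # 2 data frames * 5 columns * 150 rows

# recursive = FALSE removes only one level of nesting:
# the result is a list holding the individual columns.
cols <- unlist(dfs, recursive = FALSE)
length(cols)    # 2 data frames * 5 columns

# do.call("rbind", ...) stacks the data frames by rows.
stacked <- do.call("rbind", dfs)
dim(stacked)    # rows of both data frames, same 5 columns
```

Comparing length() and dim() of the three results is a quick way to check which shape a given piece of code is actually producing.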
