
R parallel system call on files

I have to convert a large number of RAW images and am using the program dcraw to do that. Since dcraw only uses one core, I want to parallelize the conversion from R. To call the program I use:

system("dcraw.exe -4 -T image.NEF")

This writes a file called image.tiff to the same folder as the NEF file, which is totally fine. I have tried several R packages to parallelize this, but I only get nonsensical results (probably my own fault). I want to run a large list of files (1000+), obtained with list.files(), through this system call in R.

I could only find info on parallel programming for variables within R but not for system calls. Anybody got any ideas? Thanks!

It doesn't matter whether you use variables or system(). Assuming you're not on Windows (which doesn't support the forking that mclapply relies on), on any decent system you can run

# Fork up to 8 workers; each call runs dcraw on a single file.
parallel::mclapply(Sys.glob("*.NEF"),
  function(fn) system(paste("dcraw.exe -4 -T", shQuote(fn))),
  mc.cores = 8, mc.preschedule = FALSE)

It will run 8 jobs in parallel. But then you may as well skip R and use GNU parallel instead:

ls *.NEF | parallel -u -j8 'dcraw.exe -4 -T {}'
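
If you stay in R, it can help to keep the exit status of every call so that failed conversions are easy to find. The following is only a sketch under assumptions: convert_all is a hypothetical helper name, the dcraw command and flags are taken from the question, and system2() is swapped in for system() so the arguments are passed as a vector.

```r
library(parallel)

# Hypothetical helper: run `cmd args <file>` for each file on forked workers
# and return a named logical vector, TRUE where the exit status was 0.
# Relies on forking, so it is not meant for Windows.
convert_all <- function(files, cmd = "dcraw.exe", args = c("-4", "-T"), cores = 8) {
  status <- mclapply(files, function(fn) {
    system2(cmd, c(args, shQuote(fn)))
  }, mc.cores = cores, mc.preschedule = FALSE)
  # A worker that hit an R error returns a try-error object, not a number.
  ok <- vapply(status, function(s) is.numeric(s) && s == 0, logical(1))
  names(ok) <- files
  ok
}

files <- list.files(pattern = "\\.NEF$")
if (length(files) > 0) {
  ok <- convert_all(files)
  print(names(ok)[!ok])  # files whose conversion did not exit with status 0
}
```

With mc.preschedule = FALSE each file is handed to a freshly forked worker as soon as a slot is free, which suits jobs whose run time varies from file to file.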

On Windows I use a modification of the top-voted answer to "Parallel execution of shell processes" to run many commands with no more than, say, 4 or 8 running simultaneously.

It's not an R solution, but I like it.
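
For anyone who does want to stay in R on Windows, the parallel package also offers socket clusters, which do not depend on forking. The sketch below is an assumption-laden illustration, not something from the answers above: run_one and convert_all_psock are hypothetical names, and the worker count of 4 is arbitrary.

```r
library(parallel)

# Worker function, defined at top level so it can be sent to the workers.
run_one <- function(fn, cmd, args) system2(cmd, c(args, shQuote(fn)))

# Hypothetical helper: start a socket (PSOCK) cluster, give each free worker
# the next file, and return the exit statuses named by file.
convert_all_psock <- function(files, cmd = "dcraw.exe", args = c("-4", "-T"), workers = 4) {
  if (length(files) == 0) return(integer(0))
  cl <- makeCluster(workers)
  on.exit(stopCluster(cl))  # always shut the workers down
  status <- clusterApplyLB(cl, files, run_one, cmd = cmd, args = args)
  stats::setNames(unlist(status), files)
}

files <- list.files(pattern = "\\.NEF$", full.names = TRUE)
if (length(files) > 0) {
  status <- convert_all_psock(files)
  print(names(status)[status != 0])  # files whose conversion failed
}
```

clusterApplyLB hands out one file at a time, so at most `workers` dcraw processes run at once and a slow file does not hold up the rest.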
