
R parallel system call on files

I have to convert a large number of RAW images and am using the program DCRAW to do that. Since this program uses only one core, I want to parallelize the conversion in R. To call the program I use:

system("dcraw.exe -4 -T image.NEF")

This writes a file called image.tiff to the same folder as the NEF file, which is totally fine. I tried multiple R packages to parallelize this, but I only get nonsensical results (probably caused by me). I want to run a large list of files (1000+), obtained with list.files(), through this system call in R.

I could only find information on parallel programming for variables within R, but not for system calls. Does anybody have any ideas? Thanks!

It doesn't matter whether you work on R variables or call system(). Assuming you're not on Windows (where the forking that mclapply relies on is not supported), on any decent system you can run

parallel::mclapply(Sys.glob("*.NEF"),
  function(fn) system(paste("dcraw.exe -4 -T", shQuote(fn))),
  mc.cores = 8, mc.preschedule = FALSE)

This will run 8 jobs in parallel. But then you may as well skip R and use GNU parallel instead:

ls *.NEF | parallel -u -j8 'dcraw.exe -4 -T {}'
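Because system() returns the exit status of the command it ran, the mclapply approach above can also report which files failed to convert. The following is a minimal sketch under that assumption, not part of the original answer; the helper names `dcraw_command` and `failed_conversions` and the parameters `exe`, `build_cmd` and `cores` are made up for illustration. Like mclapply itself, it needs a system that supports forking.

```r
library(parallel)

# Build the shell command for one file (flags taken from the question).
# `exe` is an illustrative parameter; adjust it to wherever dcraw lives.
dcraw_command <- function(fn, exe = "dcraw.exe") {
  paste(exe, "-4 -T", shQuote(fn))
}

# Run one command per file in parallel (forked workers) and return the
# files whose command finished with a non-zero exit status.
failed_conversions <- function(files, build_cmd = dcraw_command, cores = 8) {
  status <- mclapply(files, function(fn) system(build_cmd(fn)),
                     mc.cores = cores, mc.preschedule = FALSE)
  files[unlist(status) != 0]
}

# Hypothetical usage:
# failed <- failed_conversions(Sys.glob("*.NEF"))
```

Passing the command builder as an argument keeps the parallel logic separate from the dcraw-specific flags, so the same function could drive any other command-line tool.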

On Windows I use a modification of this solution (the top-voted answer) to run many commands with no more than, say, 4 or 8 running simultaneously:

Parallel execution of shell processes

It's not an R solution, but I like it.
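For readers who would rather stay in R on Windows, one option is a socket (PSOCK) cluster from the parallel package, which starts separate R worker processes instead of forking. This is a hedged sketch and not the method used in the answer above; `run_one`, `run_on_cluster`, `build_cmd` and `workers` are hypothetical names chosen for the example.

```r
library(parallel)

# Worker function: run the command built for one file and return its exit status.
run_one <- function(fn, build_cmd) system(build_cmd(fn))

# Run build_cmd(file) for every file on a socket cluster with `workers` processes.
# At most `workers` commands run at the same time.
run_on_cluster <- function(files, build_cmd, workers = 4) {
  cl <- makeCluster(workers)
  on.exit(stopCluster(cl), add = TRUE)  # always shut the workers down
  # Load-balanced variant: a worker picks up the next file as soon as it is free.
  unlist(parLapplyLB(cl, files, run_one, build_cmd = build_cmd))
}

# Hypothetical usage with the dcraw call from the question:
# files  <- list.files(pattern = "\\.NEF$", full.names = TRUE)
# status <- run_on_cluster(files, function(fn) paste("dcraw.exe -4 -T", shQuote(fn)))
# files[status != 0]  # files whose conversion reported a non-zero exit status
```

The load-balanced parLapplyLB is used here on the assumption that conversion time varies between images; with uniform files, plain parLapply would do the same job.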

