Why is an R for loop 10 times slower than using foreach?

Question

This is really blowing my mind. The basic loop takes about 8 seconds on my computer:

system.time({
x <- 0
for (p in 1:2) {
    for (i in 1:500) {
        for (j in 1:5000) {
            x <- x + i * j
        }
    }
}
})
x

Whereas if I use foreach in non-parallel mode, it takes only 0.7 seconds!

library(foreach)

system.time({
x <- 0
foreach(p = 1:2, .combine = rbind) %do% 
    for (i in 1:500) {
        for (j in 1:5000) {
            x <- x + i * j
        }
    }
})
x

The result is the same, but foreach somehow reached it much faster than base R! Where is the inefficiency in base R?

How is this possible?

In fact, I got the complete opposite of the result reported in this question: Why is foreach() %do% sometimes slower than for?

Answer 1

When used sequentially, foreach ends up using the compiler package to produce byte code, via the non-exported functions make.codeBuf and cmp. You can use cmpfun to compile the inner loop into byte code yourself to simulate this and achieve a similar speedup.

library(foreach)
library(compiler)  # provides cmpfun

f.original <- function() {
x <- 0
for (p in 1:2) {
    for (i in 1:500) {
        for (j in 1:5000) {
            x <- x + i * j
        }
    }
}
x
}

f.foreach <- function() {
x <- 0
foreach(p = 1:2, .combine = rbind) %do% 
    for (i in 1:500) {
        for (j in 1:5000) {
            x <- x + i * j
        }
    }
x
}

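f.cmpfun <- function() {
    # Byte-compile the two inner loops with cmpfun.
    f <- cmpfun(function(x) {
        for (i in 1:500) {
            for (j in 1:5000) {
                x <- x + i * j
            }
        }
        x
    })
    # Two chained calls stand in for the outer loop over p in 1:2.
    f(f(0))
}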

Results

library(microbenchmark)
microbenchmark(f.original(),f.foreach(),f.cmpfun(), times=5)
Unit: milliseconds
         expr       min        lq    median        uq       max neval
 f.original() 4033.6114 4051.5422 4061.7211 4072.6700 4079.0338     5
  f.foreach()  426.0977  429.6853  434.0246  437.0178  447.9809     5
   f.cmpfun()  418.2016  427.9036  441.7873  444.1142  444.4260     5
all.equal(f.original(),f.foreach(),f.cmpfun())
[1] TRUE
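
To see the byte-compilation effect without foreach, the sketch below times the same triple loop with the JIT switched off and then after an explicit cmpfun call. It is only an illustration under some assumptions: triple_loop and its smaller bounds are hypothetical names and values chosen so that it runs quickly, and the timings will differ from the ones above. As far as I know, R 3.4.0 and later enable the JIT by default, so on a recent R the plain for loop is probably compiled automatically and the gap may not appear unless the JIT is turned off as done here. The results are also compared pairwise, because all.equal compares only its first two arguments; for numeric inputs a third positional argument is matched to tolerance.

```r
library(compiler)  # ships with R; provides cmpfun and enableJIT

# Hypothetical helper: the same triple loop as above, with smaller bounds.
triple_loop <- function(n_i = 100, n_j = 1000) {
    x <- 0
    for (p in 1:2) {
        for (i in seq_len(n_i)) {
            for (j in seq_len(n_j)) {
                x <- x + i * j
            }
        }
    }
    x
}

# Turn the JIT off so the loop is interpreted; enableJIT returns the previous level.
old_level <- enableJIT(0)
t_interp <- system.time(r_interp <- triple_loop())

# Explicit byte compilation, as in f.cmpfun above.
triple_loop_cmp <- cmpfun(triple_loop)
t_cmp <- system.time(r_cmp <- triple_loop_cmp())

# Restore whatever JIT level was active before.
enableJIT(old_level)

# Compare timings, then check the two results against each other.
print(rbind(interpreted = t_interp, compiled = t_cmp)[, "elapsed"])
print(all.equal(r_interp, r_cmp))
```

With three functions to check, one all.equal call per pair (for example f.original() against f.foreach(), then f.original() against f.cmpfun()) avoids the third result silently being used as a tolerance.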

Answer 1
9 · accepted · 2014-07-09 12:43:32
