
R global assignment operator in a function - what's a better alternative?

I have a function in a package (mainly for my own use currently, though I might share it at some future point). I'm trying to replace a slow for loop with an lapply so that I can parallelise it later. One option I found that is hugely faster, even without parallelisation, is to use the global assignment operator. However, I'm anxious about this because it seems to be frowned upon, and I'm not used to thinking about environments, so I worry about side effects.

Here is a simple reprex:



n <- 2
nx <- 40
v <- 5
d <- 3

array4d <- array(rep(0, n * nx * v * d) ,
                       dim = c(n, nx, v, d) )
array4d2 <- array4d

# Make some data to enter into the array - in real problem a function gens this data depending on input vars

set.seed(4)
dummy_output <- lapply(1:v, function(i) runif(n*nx*d))

microbenchmark::microbenchmark( {
    for(i in 1:v){
        array4d[ , , i, ] <- dummy_output[[i]]
    }
}, {
    lapply(1: v, function(i) {
        array4d2[ , , i, ] <<- dummy_output[[i]]
    })
})

Unit: microseconds
                                                                                     expr      min        lq
             {     for (i in 1:v) {         array4d[, , i, ] <- dummy_output[[i]]     } } 1183.504 1273.6205
 {     lapply(1:v, function(i) {         array4d2[, , i, ] <<- dummy_output[[i]]     }) }   13.257   16.1715
       mean    median       uq      max neval cld
 1488.26909 1411.4565 1515.762 3535.974   100   b
   33.56976   18.1445   21.150 1525.608   100  a 
> 
> identical(array4d, array4d2)
[1] TRUE

All of this would be happening inside a function called many times by its parent.

So this is (lots!) faster. But my questions are:

  1. Is it safe to do this?
  2. Is there a similarly fast alternative that does not use <<- ?
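On the first question, it may help to see which variable <<- actually modifies once the code lives inside a function. The following is a minimal sketch, not the package's real code: the wrapper name fill_with_superassign and the local variable out are hypothetical, and the data mirror the reprex above. Because <<- searches the enclosing environments starting from the parent of the anonymous function, it finds the out that is local to the wrapper call and updates that one rather than a global.

```r
# Hypothetical wrapper, for illustration only: same fill pattern as the
# reprex, but the target array is created inside the function.
fill_with_superassign <- function(chunks, n, nx, d) {
  v <- length(chunks)
  out <- array(0, dim = c(n, nx, v, d))  # local to this call
  invisible(lapply(seq_len(v), function(i) {
    # `<<-` starts looking in the enclosing environment of this anonymous
    # function, which is the execution environment of the wrapper, so it
    # updates the local `out` above.
    out[, , i, ] <<- chunks[[i]]
  }))
  out
}

n <- 2; nx <- 40; v <- 5; d <- 3
set.seed(4)
dummy_output <- lapply(1:v, function(i) runif(n * nx * d))

res <- fill_with_superassign(dummy_output, n, nx, d)
dim(res)
```

If no variable of that name exists in any enclosing function, <<- falls through to the global environment, which is where unexpected side effects come from. Note also that this pattern is unlikely to carry over to parallel execution: with separate worker processes, an assignment made in a worker does not propagate back to the calling session.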

Make the varying dimension the last one. microbenchmark indicates that its performance is not statistically different from that of the version using <<-. If it is important that the varying dimension be the third, use aperm(x, c(1, 2, 4, 3)) afterwards.

microbenchmark::microbenchmark(
    a = for(i in 1:v) array4d[ , , i, ] <- dummy_output[[i]],
    b = lapply(1:v, function(i) array4d2[ , , i, ] <<- dummy_output[[i]]),
    # c: build the array in one step, with the varying dimension (v) last
    c = array(unlist(dummy_output), dim = c(n, nx, d, v))
)
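As a check on the approach above, here is a small sketch that builds the array with the varying dimension last, moves it back to third position with aperm, and compares the result with the for-loop version from the question. The helper name fill_array and the variable ref are hypothetical; the data are the same as in the reprex.

```r
n <- 2; nx <- 40; v <- 5; d <- 3
set.seed(4)
dummy_output <- lapply(1:v, function(i) runif(n * nx * d))

# Reference result: the original for loop from the question.
ref <- array(0, dim = c(n, nx, v, d))
for (i in 1:v) ref[, , i, ] <- dummy_output[[i]]

# Hypothetical helper: no loop and no `<<-`.
fill_array <- function(chunks, n, nx, d) {
  v <- length(chunks)
  # Each chunk fills one slice along the last dimension.
  last <- array(unlist(chunks), dim = c(n, nx, d, v))
  # Swap dimensions 3 and 4 so the varying dimension is third again.
  aperm(last, c(1, 2, 4, 3))
}

res <- fill_array(dummy_output, n, nx, d)
stopifnot(identical(res, ref))
```

Since fill_array only reads its arguments and returns a new array, it has no side effects on the calling environment. The aperm step copies the data, so it is worth benchmarking on realistic sizes, or skipping entirely if downstream code can work with the varying dimension last.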
