Speeding up the calculation of squared error for large arrays in R

Question

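Basically, I'm helping someone write some code for their research, but my usual time-saving tricks have not cut the run time of their algorithm down to something reasonable. I'm hoping someone knows a better way to make the function run fast, based on an example I wrote that avoids including any information about the research.

The objects in the example are smaller than the ones she is working with (but they can easily be made bigger). For the actual algorithm, this small part takes about 3 minutes, but it could take 8-10 minutes in the full case, and it may need to be run 1000-10000 times. That is why I need to cut the run time seriously.

How I'm currently doing it (hopefully there are enough comments to make my thought process clear):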

example<-array(rnorm(100000), dim=c(5, 25, 40, 20))

observation <- array(rnorm(300), dim=c(5, 5, 12))

calc.err<-function(value, observation){
  #'This creates the squared error for each observation, and each point in the
  #'example array, across the five values in the first dimension of each

  sqError<-(value-observation)^2

  #'the apply function here sums up the squared error for each observation and
  #'point.  This is the value returned

  return(apply(sqError, c(2,3), function(x) sum(x)))
}

run<-apply(example, c(2,3,4), function(x) calc.err(x, observation))

#'It isn't returned in the right format (small problem) but reformatting is fast
format<-array(run, dim=c(5, 12, 25, 40, 20))

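Will clarify if necessary.

Edit: the data.table package seems to be very helpful. I will have to learn that package, but preliminarily it looks a lot faster. I think I was using arrays because the code she gave me to work with had the objects formatted that way. I didn't even think of changing it.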

Answer 1

Here are a couple of simple refactorings, along with timings:

calc.err2 <- function(value, observation){
  #'This creates the squared error for each observation, and each point in the
  #'example array, across the five values in the first dimension of each

  sqError<-(value-observation)^2

  #' getting rid of the anonymous function

  apply(sqError, c(2,3), sum)
}

calc.err3 <- function(value, observation){
  #'This creates the squared error for each observation, and each point in the
  #'example array, across the five values in the first dimension of each

  sqError<-(value-observation)^2

  #' replacing with colSums

  colSums(sqError)
}
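
The swap from apply to colSums in calc.err3 relies on the two giving the same result for a three-dimensional array. A minimal check of that, assuming an array with the same 5 x 5 x 12 shape as sqError in the question (the variable names via_apply and via_colSums are made up for illustration):

```r
# Stand-in for sqError: a 5 x 5 x 12 array of squared values.
set.seed(42)
sqError <- array(rnorm(5 * 5 * 12)^2, dim = c(5, 5, 12))

# Sum over the first dimension, keeping dimensions 2 and 3.
via_apply   <- apply(sqError, c(2, 3), sum)

# colSums with its default dims = 1 also sums over the first dimension
# and returns an array with the remaining dimensions (5 x 12 here).
via_colSums <- colSums(sqError)

dim(via_colSums)
all.equal(via_apply, via_colSums)
```

colSums does the summation in compiled code in a single call, whereas apply calls sum once per cell from R.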


R>library(microbenchmark)
R>microbenchmark(times=8, apply(example, 2:4, calc.err, observation),
+   apply(example, 2:4, calc.err2, observation),
+   apply(example, 2:4, calc.err3, observation)
+ )
Unit: milliseconds
                                        expr         min          lq
  apply(example, 2:4, calc.err, observation) 2284.350162 2321.875878
 apply(example, 2:4, calc.err2, observation) 2194.316755 2257.007572
 apply(example, 2:4, calc.err3, observation)  645.004808  652.567611
         mean       median           uq         max neval
 2349.7524509 2336.6661645 2393.3452420 2409.894876     8
 2301.7896566 2298.9346090 2362.5479790 2383.020177     8
  681.3176878  667.9070175  720.7049605  723.177516     8

colSums is much faster than the corresponding apply.
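
A possible further step, which is not part of the timings above and is offered only as a sketch: since sum((v - o)^2) expands to sum(o^2) + sum(v^2) - 2 * sum(o * v), the outer apply over example can in principle be replaced by one matrix product. The function name sq_err_matrix is hypothetical, and the arrays are assumed to have the shapes used in the question. The expansion can lose a little precision to cancellation, so the result is compared with the calc.err3 version using all.equal rather than assumed to be identical.

```r
set.seed(1)
example     <- array(rnorm(5 * 25 * 40 * 20), dim = c(5, 25, 40, 20))
observation <- array(rnorm(5 * 5 * 12), dim = c(5, 5, 12))

# Hypothetical helper: squared error summed over the first dimension,
# for every observation column against every example column.
sq_err_matrix <- function(example, observation) {
  n <- dim(example)[1]
  stopifnot(dim(observation)[1] == n)
  E <- matrix(example, nrow = n)      # 5 x 20000, one column per point
  O <- matrix(observation, nrow = n)  # 5 x 60, one column per observation
  # sum((e - o)^2) = sum(o^2) + sum(e^2) - 2 * sum(o * e)
  res <- outer(colSums(O^2), colSums(E^2), "+") - 2 * crossprod(O, E)
  array(res, dim = c(dim(observation)[-1], dim(example)[-1]))
}

# Reference result using calc.err3 from the answer, reshaped as in the question.
calc.err3 <- function(value, observation) colSums((value - observation)^2)
reference <- array(apply(example, 2:4, calc.err3, observation),
                   dim = c(5, 12, 25, 40, 20))

fast <- sq_err_matrix(example, observation)
all.equal(fast, reference)
```

Whether this is faster on the real data would need to be measured, for instance with the same microbenchmark call. The intermediate 60 x 20000 matrices also grow with the product of the two array sizes, so memory may become the limit for much larger inputs.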

Answer 1
0 2015-07-06 16:44:55
