
R vs. Matlab: Explanation for speed difference for rnorm, qnorm and pnorm functions

I compared the performance of the built-in R functions rnorm, qnorm and pnorm to the equivalent Matlab functions.

It seems that the rnorm and pnorm functions are 3-6 times slower in R than in Matlab, whereas the qnorm function is ca. 40% faster in R. I tried the Rcpp package to speed up the R functions by calling the corresponding C routines directly; this reduced the runtime by roughly 30%, but rnorm and pnorm remain significantly slower than in Matlab.
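For context, roughly the kind of Rcpp wrapper I tried (a sketch; the name rnorm_cpp is just illustrative). Rcpp's sugar rnorm() calls R's own C routine, so it removes interpreter overhead but not the cost of the underlying generator:

 library(Rcpp)
 cppFunction("
 NumericVector rnorm_cpp(int n) {
   return rnorm(n);  // Rcpp sugar; same underlying RNG as R's rnorm()
 }")
 x <- rnorm_cpp(1e5)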

Is there a package available which provides a faster way of simulating normally distributed random variables in R (other than the standard rnorm function)?

I see two distinct issues here, one in each paragraph:

  • Yes, there are differences between languages / systems such as R and Matlab. Part of it has to do with the interpreter, the speed of loops, the speed of function calls, etc. Rcpp can help close the gap with Matlab, which has a genuine JIT compiler. We compare Matlab, R and R+Rcpp for a Kalman filter in the recent paper on RcppArmadillo.

  • There are also differences in the underlying compiled code, and yes, R does not always have the fastest implementation, as R Core (IMHO rightly) goes for precision first. (And Rcpp does not help per se: we just call what R has internally.) This came up, e.g., with the Gibbs sampler example for MCMC that Darren Wilkinson started, where I noticed that R's rgamma() is much slower than in other systems. So to get to your question regarding faster N(0,1) draws: I think we need a contributed Ziggurat implementation. It is one of the faster N(0,1) generators out there, and a few other systems use it. (A small base-R illustration of the precision-versus-speed trade-off follows this list.)
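As an aside (not from the original answer): base R already lets you swap the normal generator via RNGkind(), which illustrates the trade-off mentioned above. The default "Inversion" method is the most accurate, while older methods such as "Kinderman-Ramage" are typically cheaper per draw. A sketch of how one might compare them:

 library("rbenchmark")
 RNGkind(normal.kind = "Inversion")          # R's default: precision first
 t_inv <- benchmark(rnorm(1e6), replications = 20)$elapsed
 RNGkind(normal.kind = "Kinderman-Ramage")   # older, typically faster method
 t_kr <- benchmark(rnorm(1e6), replications = 20)$elapsed
 RNGkind(normal.kind = "Inversion")          # restore the default
 c(inversion = t_inv, kinderman_ramage = t_kr)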

To promote my comment to an answer: yes, there is.

library("sos"); findFn("Ziggurat") library("sos"); findFn("Ziggurat") finds the rziggurat function in the SuppDists package; it is implemented in C (or C++?), and its documentation says

This implementation running in R is approximately three times as fast as rnorm().

The other point to note, which may make as much or more difference in practice, is that drawing a big block of random numbers is much faster in R than drawing them one by one, i.e. rnorm(1e6) is much faster than vapply(seq(1e6), function(i) rnorm(1), numeric(1)).

 library("SuppDists")
 library("rbenchmark")
 n <- 1e5
 benchmark(rziggurat(n),
          rnorm(n),
          vapply(seq(n),function(x) rnorm(1),numeric(1)))

##           test   elapsed   relative user.self
## 2     rnorm(n)     1.138     13.233     1.140
## 1 rziggurat(n)     0.086      1.000     0.088
## 3  vapply(...)    29.043    337.709    29.046
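As a quick sanity check (my addition, not part of the benchmark above), the Ziggurat draws should look like standard normals:

 x <- rziggurat(1e6)
 c(mean = mean(x), sd = sd(x))   # expect values close to 0 and 1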
