Why is this Rcpp code slower than byte compiled R?
As the question title says, I'd like to know why the byte compiled R code (using compiler::cmpfun) is faster than equivalent Rcpp code for the following mathematical function:
func1 <- function(alpha, tau, rho, phi) {
abs((alpha + 1)^(tau) * phi - rho * (1- (1 + alpha)^(tau))/(1 - (1 + alpha)))
}
Since this is a simple numerical operation, I would have expected Rcpp (funcCpp and funcCpp2) to be much faster than the byte compiled R (func1c and func2c), especially since R has more overhead for storing (1+alpha)**tau or needs to recompute it. In fact, computing this exponent two times seems faster than the memory allocation in R (func1c vs func2c), which seems especially counterintuitive, since n is large. My other guess is that maybe compiler::cmpfun is pulling off some magic, but I'd like to know if that is indeed the case.
So really, the two things I'd like to know are:
Why are funcCpp and funcCpp2 slower than func1c and func2c? (Rcpp slower than compiled R functions)
Why is funcCpp slower than func2? (Rcpp code slower than pure R)
FWIW, here's my C++ and R version data:
user% g++ --version
Configured with: --prefix=/Library/Developer/CommandLineTools/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.0.0 (clang-700.0.72)
Target: x86_64-apple-darwin14.3.0
Thread model: posix
user% R --version
R version 3.2.2 (2015-08-14) -- "Fire Safety"
Copyright (C) 2015 The R Foundation for Statistical Computing
Platform: x86_64-apple-darwin14.5.0 (64-bit)
And here's the R and Rcpp code:
library(Rcpp)
library(rbenchmark)
func1 <- function(alpha, tau, rho, phi) {
abs((1 + alpha)^(tau) * phi - rho * (1- (1 + alpha)^(tau))/(1 - (1 + alpha)))
}
func2 <- function(alpha, tau, rho, phi) {
pval <- (alpha + 1)^(tau)
abs( pval * phi - rho * (1- pval)/(1 - (1 + alpha)))
}
func1c <- compiler::cmpfun(func1)
func2c <- compiler::cmpfun(func2)
func3c <- Rcpp::cppFunction('
double funcCpp(double alpha, int tau, double rho, double phi) {
double pow_val = std::exp(tau * std::log(alpha + 1.0));
double pAg = rho/alpha;
return std::abs(pow_val * (phi - pAg) + pAg);
}')
func4c <- Rcpp::cppFunction('
double funcCpp2(double alpha, int tau, double rho, double phi) {
double pow_val = pow(alpha + 1.0, tau) ;
double pAg = rho/alpha;
return std::abs(pow_val * (phi - pAg) + pAg);
}')
res <- benchmark(
func1(0.01, 200, 100, 1000000),
func1c(0.01, 200, 100, 1000000),
func2(0.01, 200, 100, 1000000),
func2c(0.01, 200, 100, 1000000),
func3c(0.01, 200, 100, 1000000),
func4c(0.01, 200, 100, 1000000),
funcCpp(0.01, 200, 100, 1000000),
funcCpp2(0.01, 200, 100, 1000000),
replications = 100000,
order='relative',
columns=c("test", "replications", "elapsed", "relative"))
And here's the output of rbenchmark:
test replications elapsed relative
func1c(0.01, 200, 100, 1e+06) 100000 0.349 1.000
func2c(0.01, 200, 100, 1e+06) 100000 0.372 1.066
funcCpp2(0.01, 200, 100, 1e+06) 100000 0.483 1.384
func4c(0.01, 200, 100, 1e+06) 100000 0.509 1.458
func2(0.01, 200, 100, 1e+06) 100000 0.510 1.461
funcCpp(0.01, 200, 100, 1e+06) 100000 0.524 1.501
func3c(0.01, 200, 100, 1e+06) 100000 0.546 1.564
func1(0.01, 200, 100, 1e+06) 100000 0.549 1.573
This is essentially an ill-posed question. When you posit
func1 <- function(alpha, tau, rho, phi) {
abs((alpha + 1)^(tau) * phi - rho * (1- (1 + alpha)^(tau))/(1 - (1 + alpha)))
}
without even specifying what the arguments are (i.e. scalar? vector? big? small? memory overhead?), then you may in the best case just get a small set of (base, efficient) function calls directly from the parsed expression.
And ever since we've had the byte compiler, which has since been improved by Luke Tierney in subsequent R releases, we have known that it handles algebraic expressions well.
Now, compiled C/C++ code does that well too -- but there will be overhead in calling the compiled code, and what you see here is that for "trivial enough" problems, the overhead does not really get amortized.
So you end up with pretty much a draw. No surprise as far as I can tell.