简体   繁体   中英

Returning an R function from rcpp

Is there a way in Rcpp to return an R function with some pre-computed values that are only computed on the first function call? Consider the following R code:

1: func_generator<-function(X) {
2:  X_tot<-sum(X)
3:  function(b_vec) { (X_tot*b_vec) }
4: }
5: myfunc<-func_generator(c(3,4,5))
6: myfunc(1:2)
7: myfunc(5:6)
8: myfunc2<-func_generator(c(10,11,12,13))
...

Can this be programmed in Rcpp? In practice, assume that something more computationally intensive is done in place of line 2.

To add context, given vector X and scalar b, there is some likelihood function f(b|X), which can be reexpressed as f(b,s(X)) for some sufficient statistic s(X) that is a function only of X, and which involves some computation. This is in a computationally intensive computer experiment, with many vectors X (many likelihoods), and many separate calls to f(bvec|X) for each likelihood, so I'd rather compute s(X) once (for each likelihood) and save it in some fashion rather than re-computing it many times. I've started by simply programming f(bvec,X) to evaluate f(b|X) at the points bvec=(b_1,...,b_n), but this has extra overhead since I call this function several times and it computes s(X) on each run. I'd like to just compute s(X) once.

Any suggestions to accomplish this task efficiently in Rcpp would be appreciated (whether via returning a function; or via storing intermediate calculations in some other fashion).

One simple way to store intermediate results would be a static variable at function level:

// [[Rcpp::plugins(cpp11)]]
#include <thread>
#include <chrono>
#include <Rcpp.h>

// [[Rcpp::export]]
Rcpp::NumericVector foo(Rcpp::NumericVector X, Rcpp::NumericVector b, bool useCache = true) {
  static double cache;
  static bool initialized{false};
  if (!(useCache && initialized)) {
    // sleep to simulate actual work
    std::this_thread::sleep_for (std::chrono::seconds(1));
    cache = Rcpp::sum(X);
    initialized = true;
  }
  return cache * b;
}

/*** R
X <- 1:10
b <- 10:20

system.time(r1 <- foo(X, b))
system.time(r2 <- foo(X, b))
all.equal(r1, r2)
system.time(r3 <- foo(X, b, FALSE))
all.equal(r1, r3)
*/

Output:

> system.time(r1 <- foo(X, b))
   user  system elapsed 
      0       0       1 

> system.time(r2 <- foo(X, b))
   user  system elapsed 
  0.002   0.000   0.002 

> all.equal(r1, r2)
[1] TRUE

> system.time(r3 <- foo(X, b, FALSE))
   user  system elapsed 
      0       0       1 

> all.equal(r1, r3)
[1] TRUE

When the cache is used in the second function call, the result is computed almost instantaneously.

This approach is efficient if you can loop over the different b within a loop over the different X . If this restriction does not work for you, then you could use the memoise package at the R level to efficiently store the output of your expensive function for arbitrary input:

// [[Rcpp::plugins(cpp11)]]
#include <thread>
#include <chrono>
#include <Rcpp.h>

// [[Rcpp::export]]
Rcpp::NumericVector foo(double total, Rcpp::NumericVector b) {
  return total * b;
}

// [[Rcpp::export]]
double bar(Rcpp::NumericVector X) {
  // sleep to simulate actual work
  std::this_thread::sleep_for (std::chrono::seconds(1));
  return Rcpp::sum(X);
}


/*** R
X1 <- 1:10
b1 <- 10:20
X2 <- 10:1
b2 <- 20:10

library(memoise)
bar2 <- memoise(bar)

system.time(r11 <- foo(bar2(X1), b1))
system.time(r21 <- foo(bar2(X2), b2))
system.time(r12 <- foo(bar2(X1), b1))
system.time(r22 <- foo(bar2(X2), b2))
all.equal(r11, r12)
all.equal(r21, r22)
*/

Output:

> system.time(r11 <- foo(bar2(X1), b1))
   user  system elapsed 
  0.001   0.000   1.001 

> system.time(r21 <- foo(bar2(X2), b2))
   user  system elapsed 
  0.033   0.000   1.033 

> system.time(r12 <- foo(bar2(X1), b1))
   user  system elapsed 
      0       0       0 

> system.time(r22 <- foo(bar2(X2), b2))
   user  system elapsed 
      0       0       0 

> all.equal(r11, r12)
[1] TRUE

> all.equal(r21, r22)
[1] TRUE

As an alternative you could also use these two functions as building blocks for your function generator:

func_generator <- function(X) {
  X_tot <- bar(X)
  function(b_vec) { foo(X_tot, b_vec) }
}
myfunc <- func_generator(c(3,4,5))
myfunc2 <- func_generator(c(10,11,12,13))
myfunc(1:2)
myfunc(5:6)
myfunc2(1:2)
myfunc2(5:6)

So keep the numerical expensive work in C++, but keep it simple. The functional aspects can then be added using R.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM