简体   繁体   English

从rcpp返回R函数

[英]Returning an R function from rcpp

Is there a way in Rcpp to return an R function with some pre-computed values that are only computed on the first function call? 在Rcpp中是否有一种方法可以返回一个带有一些预计算值的R函数,这些值只在第一次函数调用时计算出来? Consider the following R code: 考虑以下R代码:

1: func_generator<-function(X) {
2:  X_tot<-sum(X)
3:  function(b_vec) { (X_tot*b_vec) }
4: }
5: myfunc<-func_generator(c(3,4,5))
6: myfunc(1:2)
7: myfunc(5:6)
8: myfunc2<-func_generator(c(10,11,12,13))
...

Can this be programmed in Rcpp? 这可以用Rcpp编程吗? In practice, assume that something more computationally intensive is done in place of line 2. 在实践中,假设代替第2行进行更加计算密集的操作。

To add context, given vector X and scalar b, there is some likelihood function f(b|X), which can be reexpressed as f(b,s(X)) for some sufficient statistic s(X) that is a function only of X, and which involves some computation. 为了添加上下文,给定向量X和标量b,有一些似然函数f(b | X),它可以重新表达为f(b,s(X)),用于某些足够的统计量s(X),它只是一个函数X的,涉及一些计算。 This is in a computationally intensive computer experiment, with many vectors X (many likelihoods), and many separate calls to f(bvec|X) for each likelihood, so I'd rather compute s(X) once (for each likelihood) and save it in some fashion rather than re-computing it many times. 这是一个计算密集型的计算机实验,有许多向量X(很多可能性),并且对于每个可能性有许多单独调用f(bvec | X),所以我宁愿计算s(X)一次(对于每个可能性)和以某种方式保存它而不是多次重新计算它。 I've started by simply programming f(bvec,X) to evaluate f(b|X) at the points bvec=(b_1,...,b_n), but this has extra overhead since I call this function several times and it computes s(X) on each run. 我开始只是简单编程f(bvec,X)来评估点bvec =(b_1,...,b_n)处的f(b | X),但这有额外的开销,因为我多次调用此函数并且它每次运行时计算s(X)。 I'd like to just compute s(X) once. 我想只计算一次s(X)。

Any suggestions to accomplish this task efficiently in Rcpp would be appreciated (whether via returning a function; or via storing intermediate calculations in some other fashion). 任何有关在Rcpp中有效完成此任务的建议都将受到赞赏(无论是通过返回函数;还是通过以某种其他方式存储中间计算)。

One simple way to store intermediate results would be a static variable at function level: 存储中间结果的一种简单方法是在函数级别的静态变量:

// [[Rcpp::plugins(cpp11)]]
#include <thread>
#include <chrono>
#include <Rcpp.h>

// [[Rcpp::export]]
Rcpp::NumericVector foo(Rcpp::NumericVector X, Rcpp::NumericVector b, bool useCache = true) {
  static double cache;
  static bool initialized{false};
  if (!(useCache && initialized)) {
    // sleep to simulate actual work
    std::this_thread::sleep_for (std::chrono::seconds(1));
    cache = Rcpp::sum(X);
    initialized = true;
  }
  return cache * b;
}

/*** R
X <- 1:10
b <- 10:20

system.time(r1 <- foo(X, b))
system.time(r2 <- foo(X, b))
all.equal(r1, r2)
system.time(r3 <- foo(X, b, FALSE))
all.equal(r1, r3)
*/

Output: 输出:

> system.time(r1 <- foo(X, b))
   user  system elapsed 
      0       0       1 

> system.time(r2 <- foo(X, b))
   user  system elapsed 
  0.002   0.000   0.002 

> all.equal(r1, r2)
[1] TRUE

> system.time(r3 <- foo(X, b, FALSE))
   user  system elapsed 
      0       0       1 

> all.equal(r1, r3)
[1] TRUE

When the cache is used in the second function call, the result is computed almost instantaneously. 当在第二个函数调用中使用缓存时,几乎立即计算结果。

This approach is efficient if you can loop over the different b within a loop over the different X . 如果您可以在不同X的循环内循环不同的b则此方法很有效。 If this restriction does not work for you, then you could use the memoise package at the R level to efficiently store the output of your expensive function for arbitrary input: 如果此限制对您不起作用,那么您可以使用R级别的memoise包来有效地存储昂贵函数的输出以进行任意输入:

// [[Rcpp::plugins(cpp11)]]
#include <thread>
#include <chrono>
#include <Rcpp.h>

// [[Rcpp::export]]
Rcpp::NumericVector foo(double total, Rcpp::NumericVector b) {
  return total * b;
}

// [[Rcpp::export]]
double bar(Rcpp::NumericVector X) {
  // sleep to simulate actual work
  std::this_thread::sleep_for (std::chrono::seconds(1));
  return Rcpp::sum(X);
}


/*** R
X1 <- 1:10
b1 <- 10:20
X2 <- 10:1
b2 <- 20:10

library(memoise)
bar2 <- memoise(bar)

system.time(r11 <- foo(bar2(X1), b1))
system.time(r21 <- foo(bar2(X2), b2))
system.time(r12 <- foo(bar2(X1), b1))
system.time(r22 <- foo(bar2(X2), b2))
all.equal(r11, r12)
all.equal(r21, r22)
*/

Output: 输出:

> system.time(r11 <- foo(bar2(X1), b1))
   user  system elapsed 
  0.001   0.000   1.001 

> system.time(r21 <- foo(bar2(X2), b2))
   user  system elapsed 
  0.033   0.000   1.033 

> system.time(r12 <- foo(bar2(X1), b1))
   user  system elapsed 
      0       0       0 

> system.time(r22 <- foo(bar2(X2), b2))
   user  system elapsed 
      0       0       0 

> all.equal(r11, r12)
[1] TRUE

> all.equal(r21, r22)
[1] TRUE

As an alternative you could also use these two functions as building blocks for your function generator: 作为替代方案,您还可以将这两个函数用作函数生成器的构建块:

func_generator <- function(X) {
  X_tot <- bar(X)
  function(b_vec) { foo(X_tot, b_vec) }
}
myfunc <- func_generator(c(3,4,5))
myfunc2 <- func_generator(c(10,11,12,13))
myfunc(1:2)
myfunc(5:6)
myfunc2(1:2)
myfunc2(5:6)

So keep the numerical expensive work in C++, but keep it simple. 因此,在C ++中保留数字昂贵的工作,但要保持简单。 The functional aspects can then be added using R. 然后可以使用R添加功能方面。

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM