Vectorised Rcpp random binomial draws

Question

This is a follow-on question from this one: Generating same random variable in Rcpp and R

I'm trying to speed up a vectorised call to rbinom of this form:

    x <- c(0.1,0.4,0.6,0.7,0.8)
    rbinom(length(x),1 ,x)

In the live code, x is a vector of variable length (typically with millions of elements). I have no experience with Rcpp, but I was wondering whether I could use it to speed this up. In the linked question, @Dirk Eddelbuettel suggested this Rcpp code for non-vectorised rbinom calls:

    cppFunction("NumericVector cpprbinom(int n, double size, double prob) { \
         return(rbinom(n, size, prob)); }")
    set.seed(42); cpprbinom(10, 1, 0.5)

...and it is about twice as fast as the non-Rcpp option, but it can't handle my vectorised version:

    cpprbinom(length(x), 1, x)

How can the Rcpp code be modified to implement this?

Thanks

Answer 1

Following Dirk's response here:

Is there a way of fixing the code without using an explicit loop in the C++ code?

I don't think so. The code currently has this hard-wired: <...> so until one of us has sufficient [time] to extend this (and test it) will have to do the loop at your end.

Here's my implementation of a "vectorised" version:

library(Rcpp)
cppFunction("NumericVector cpprbinom(int n, double size, NumericVector prob) { 
    NumericVector v(n);            
    for (int i=0; i<n; i++) {v[i] = as<double>(rbinom(1, size, prob[i]));} 
    return(v); }")
r <- runif(1e6)
all.equal({set.seed(42); rbinom(length(r), 1, r)}, 
          {set.seed(42); cpprbinom(length(r), 1, r)})
#TRUE

But the problem is (again citing Dirk):

And I suggest that before expending a lot of effort on this you check whether you are likely to do better than the R function rbinom. That R function is vectorized in C code and you are unlikely to make things much faster by using Rcpp, unless you want to use the random variates in another C++ function.

And it is actually slower (about 3x on my machine), so at least a naive implementation like mine won't help:

library(microbenchmark)
microbenchmark(rbinom(length(r), 1, r), cpprbinom(length(r), 1, r))

Unit: milliseconds
                       expr       min        lq      mean    median        uq       max neval
    rbinom(length(r), 1, r)  55.50856  56.09292  56.49456  56.45297  56.65897  59.42524   100
 cpprbinom(length(r), 1, r) 117.63761 153.37599 154.94164 154.29623 155.37247 225.56535   100

EDIT: following Romain's comment below, here's an advanced version, which is faster!

cppFunction(plugins=c("cpp11"), "NumericVector cpprbinom2(int n, double size, NumericVector prob) { 
    NumericVector v = no_init(n);
    std::transform( prob.begin(), prob.end(), v.begin(), [=](double p){ return R::rbinom(size, p); }); 
    return(v);}")
r <- runif(1e6)
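# all.equal() compares only its first two arguments, so check each version separately
# (each comparison returns TRUE when the draws match rbinom's)
ref <- {set.seed(42); rbinom(length(r), 1, r)}
all.equal(ref, {set.seed(42); cpprbinom(length(r), 1, r)})
all.equal(ref, {set.seed(42); cpprbinom2(length(r), 1, r)})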
microbenchmark(rbinom(length(r), 1, r), cpprbinom(length(r), 1, r), cpprbinom2(length(r), 1, r))

Unit: milliseconds
                        expr       min        lq      mean    median        uq       max neval
     rbinom(length(r), 1, r)  55.26412  56.00314  56.57814  56.28616  56.59561  60.01861   100
  cpprbinom(length(r), 1, r) 113.72513 115.94758 122.81545 117.24708 119.95134 168.47246   100
 cpprbinom2(length(r), 1, r)  36.67589  37.12182  38.95318  37.37436  37.97719  84.73516   100
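
One caveat with the versions above: n is passed separately from prob, and cpprbinom2 writes prob.size() values into a vector of length n, so the two must agree. A possible variant, sketched here under the hypothetical name cpprbinom3 (not part of the original answer), takes the output length from prob itself:

```r
library(Rcpp)

cppFunction(plugins=c("cpp11"), "NumericVector cpprbinom3(double size, NumericVector prob) {
    // hypothetical variant: the output length always matches prob
    NumericVector v = no_init(prob.size());
    std::transform(prob.begin(), prob.end(), v.begin(),
                   [=](double p){ return R::rbinom(size, p); });
    return v; }")

r <- runif(1e6)
set.seed(42); draws <- cpprbinom3(1, r)
length(draws) == length(r)
```

I have not benchmarked this variant; it only removes the chance of passing a mismatched n.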

Answer 2

Not a general solution, but I notice that you set the size argument to 1 in your call to rbinom. If that's always the case, you can draw length(x) uniform values and then compare them to x. For instance:

 set.seed(123)
 #create the values
 x<-runif(1000000)
 system.time(res<-rbinom(length(x),1 ,x))   
 # user  system elapsed 
 #0.068   0.000   0.070
 system.time(res2<-as.integer(runif(length(x))<x))   
 # user  system elapsed 
 #0.044   0.000   0.046

Not a huge gain, but you may save a little more time by calling runif from C++, which avoids some overhead.
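
A minimal sketch of what that might look like in Rcpp, assuming size is always 1. The name cpprbern is hypothetical, and I have not timed it. Since it consumes the random number stream differently from rbinom, it is not expected to reproduce rbinom's draws for the same seed:

```r
library(Rcpp)

# Hypothetical helper: Bernoulli draws by comparing uniforms with prob (size = 1 only)
cppFunction("IntegerVector cpprbern(NumericVector prob) {
    int n = prob.size();
    IntegerVector v = no_init(n);
    for (int i = 0; i < n; i++) {
        // R::runif draws from R's RNG, so set.seed() in R controls the result
        v[i] = R::runif(0.0, 1.0) < prob[i] ? 1 : 0;
    }
    return v; }")

x <- runif(1e6)
set.seed(123)
res3 <- cpprbern(x)
mean(res3)   # should be close to mean(x)
```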
