Efficiently generating a random bit stream in Rcpp

Question

I have an auxiliary function named rbinom01 in the R package I'm currently building. Note that it calls random(3).

int rbinom01(int size) {
  if (!size) {
    return 0;
  }

  int64_t result = 0;
  while (size >= 32) {
    result += __builtin_popcount(random());
    size -= 32;
  }

  result += __builtin_popcount(random() & ~(LONG_MAX << size));

  return result;
}

When I ran R CMD check my_package, I got the following warning:

* checking compiled code ... NOTE
File ‘ my_package/libs/my_package.so’:
  Found ‘_random’, possibly from ‘random’ (C)
    Object: ‘ my_function.o’

Compiled code should not call entry points which might terminate R nor
write to stdout/stderr instead of to the console, nor use Fortran I/O
nor system RNGs.

See ‘Writing portable packages’ in the ‘Writing R Extensions’ manual.

I went to the documentation, and it says I can use one of the *_rand functions, along with a family of distribution functions. That's fine, but my package simply needs a stream of random bits rather than a random double. The easiest way to get one is to use random(3) or to read from /dev/urandom, but that makes my package "unportable".

This post suggests using sample, but unfortunately it doesn't fit my use case. For my application, generating random bits is apparently critical to performance, so I don't want to waste any time calling unif_rand, multiplying the result by N and rounding it. Anyway, the reason I'm using C++ is to exploit bit-level parallelism.

Surely I can hand-roll my own PRNG or copy and paste the code of a state-of-the-art PRNG like xoshiro256**, but before doing that I would like to see if there are any easier alternatives.

Incidentally, could someone please link me to a nice short tutorial on Rcpp? Writing R Extensions is comprehensive and awesome, but it would take me weeks to finish. I'm looking for a more concise version, though preferably one that is more informative than a call to Rcpp.package.skeleton.

As suggested by @Ralf Stubner's answer, I have rewritten the original code as follows. However, I'm getting the same result every time. How can I seed it properly and at the same time keep my code "portable"?

int rbinom01(int size) {
  dqrng::xoshiro256plus rng;

  if (!size) {
    return 0;
  }

  int result = 0;
  while (size >= 64) {
    result += __builtin_popcountll(rng());
    Rcout << sizeof(rng()) << std::endl;
    size -= 64;
  }

  result += __builtin_popcountll(rng() & ((1LLU << size) - 1));

  return result;
}

Answer 1

There are different R packages that make PRNGs available as C++ header-only libraries:

BH: Everything from boost.random
sitmo: Various Threefry versions
dqrng: PCG family, xoshiro256+ and xoroshiro128+
...

You can make use of any of these by adding LinkingTo to your package's DESCRIPTION. Typically these PRNGs are modeled after the C++11 random header, which means you have to control their life-cycle and seeding yourself. In a single-threaded environment I like to use anonymous namespaces for life-cycle control, e.g.:

#include <Rcpp.h>
// [[Rcpp::depends(dqrng)]]
#include <xoshiro.h>
// [[Rcpp::plugins(cpp11)]]

namespace {
dqrng::xoshiro256plus rng{};
}

// [[Rcpp::export]]
void set_seed(int seed) {
  rng.seed(seed);
}

// [[Rcpp::export]]
int rbinom01(int size) {
  if (!size) {
    return 0;
  }

  int result = 0;
  while (size >= 64) {
    result += __builtin_popcountll(rng());
    size -= 64;
  }

  result += __builtin_popcountll(rng() & ((1LLU << size) - 1));

  return result;
}

/*** R
set_seed(42)
rbinom01(10)
rbinom01(10)
rbinom01(10)
*/
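For readers who want to try the same pattern outside of an R package, here is a minimal standalone sketch. It rests on a few assumptions that are not part of the answer above: std::mt19937_64 stands in for dqrng::xoshiro256plus, popcount64 is a hypothetical portable replacement for __builtin_popcountll, and the remainder is taken from the high bits of the draw instead of the low bits. The last change is a variation, motivated by the xoshiro authors' note that the lowest bits of the "+" generators are of lower quality than the upper ones.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

namespace {
// Stand-in generator (assumption); in the package this would be
// dqrng::xoshiro256plus, which also yields 64-bit values.
std::mt19937_64 rng{};

// Hypothetical portable population count (Kernighan's method), for
// compilers that do not provide __builtin_popcountll.
int popcount64(std::uint64_t x) {
  int n = 0;
  while (x) {
    x &= x - 1;  // clear the lowest set bit
    ++n;
  }
  return n;
}
}  // namespace

// Re-seed the file-local generator.
void set_seed(std::uint64_t seed) { rng.seed(seed); }

// Number of ones among `size` random bits, i.e. a Binomial(size, 0.5) draw.
int rbinom01(int size) {
  assert(size >= 0);
  int result = 0;
  while (size >= 64) {
    result += popcount64(rng());
    size -= 64;
  }
  if (size > 0) {
    // Keep the top `size` bits. Here size is in 1..63, so the shift
    // amount is in 1..63 and well defined.
    result += popcount64(rng() >> (64 - size));
  }
  return result;
}
```

Calling set_seed with the same value before two calls to rbinom01 with the same size should give the same count, which is a quick way to check that the generator's state really lives outside the function.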

However, using runif isn't all bad and is certainly faster than accessing /dev/urandom. In dqrng there is a convenient wrapper for this.
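One place where R's uniform generator can help, even if the bits themselves come from another PRNG, is seeding: if the seed is derived from R's stream, results follow set.seed() on the R side. The sketch below only shows the arithmetic. The helper name seed_from_uniforms is hypothetical, and it assumes the two inputs would be obtained from R's unif_rand() inside the package; that call is left out so the block compiles without R.

```cpp
#include <cassert>
#include <cstdint>

// Combine two uniform draws from [0, 1) into one 64-bit seed by taking
// 32 bits from each draw. (Hypothetical helper, not part of dqrng or Rcpp.)
std::uint64_t seed_from_uniforms(double u1, double u2) {
  assert(u1 >= 0.0 && u1 < 1.0);
  assert(u2 >= 0.0 && u2 < 1.0);
  const double two32 = 4294967296.0;  // 2^32; scaling by it is exact
  const std::uint64_t hi = static_cast<std::uint64_t>(u1 * two32);
  const std::uint64_t lo = static_cast<std::uint64_t>(u2 * two32);
  return (hi << 32) | lo;  // hi and lo are both below 2^32
}
```

The result could then be passed to something like the set_seed function from the answer, widened to accept a 64-bit value.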

As for tutorials: besides WRE, the Rcpp package vignette is a must-read. R Packages by Hadley Wickham also has a chapter on "compiled code" if you want to go the devtools way.

Answer 1 (4, accepted, 2019-03-28 14:09:17)