Generating sample of integers in Rcpp

I want to create a random vector of 5 integers drawn from a range, e.g. 1:10. I can use ONLY basic Rcpp (without C libraries).

Currently I have:

#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
NumericVector test(){
    NumericVector z(5);
    for (int i = 0; i < 5; ++i)
        z[i] = R::runif(1, 10);  // R::runif returns a double
    return z;
}
/***R
test()
*/

But:

  • it is not integer

  • it is not unique.

This can be done concisely with std::random_shuffle:

#include <Rcpp.h>
#include <algorithm>  // std::random_shuffle

// [[Rcpp::export]]
Rcpp::IntegerVector sample_int() {
    Rcpp::IntegerVector pool = Rcpp::seq(1, 10);
    std::random_shuffle(pool.begin(), pool.end());
    return pool[Rcpp::Range(0, 4)];
} 

Sample output:

sample_int()
# [1] 9 2 5 1 7

sample_int()
# [1]  1 10  5  3  8

sample_int()
# [1] 5 9 3 2 8 
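Note that std::random_shuffle was deprecated in C++14 and removed in C++17, so the function above may not compile under newer standards. The following is a minimal sketch of the same shuffle-then-truncate idea using std::shuffle from C++11. It is written in plain C++ (no Rcpp) so it can be compiled on its own; the name sample_int_shuffle and the explicit std::mt19937 argument are assumptions for illustration, and this generator is not tied to R's set.seed().

```cpp
#include <algorithm>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical standalone variant: shuffle the whole pool min..max
// and keep the first n values. Assumes 0 <= n <= max - min + 1.
inline std::vector<int> sample_int_shuffle(int n, int min, int max,
                                           std::mt19937& gen) {
    std::vector<int> pool(max - min + 1);
    std::iota(pool.begin(), pool.end(), min);     // min, min + 1, ..., max
    std::shuffle(pool.begin(), pool.end(), gen);  // uniform random permutation
    pool.resize(n);                               // first n entries are the sample
    return pool;
}
```

Inside an Rcpp function, the returned std::vector<int> could be converted with Rcpp::wrap.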

And for the record, your code wasn't returning integers because

  • ::runif returns double values; and
  • Your function's return type was NumericVector rather than IntegerVector
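To illustrate the first point, here is a minimal sketch in plain C++ of how a uniform draw u in [0, 1) could be mapped to an integer in [min, max]. The helper name to_int_in_range is hypothetical; in an Rcpp function, u might come from something like R::unif_rand(). This only addresses the type issue and does nothing to make the draws unique.

```cpp
#include <cmath>

// Hypothetical helper: map u in [0, 1) to an integer in [min, max].
inline int to_int_in_range(double u, int min, int max) {
    // Number of integers in the closed range [min, max].
    const double width = static_cast<double>(max) - min + 1.0;
    // floor(u * width) lies in {0, ..., width - 1} when 0 <= u < 1.
    return min + static_cast<int>(std::floor(u * width));
}
```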

Although it is inconsequential for small ranges such as the one in your example (1, ..., 10), this approach is not very efficient, particularly when the number of elements being sampled is much smaller than the drawing pool, because std::random_shuffle shuffles the entire range. With a couple of auxiliary functions, we can do better (assuming std::rand is "sufficiently" random for your purposes):

#include <Rcpp.h>
#include <algorithm>  // std::swap in C++98
#include <cstdlib>    // std::rand
#include <vector>

// C++ 98
template <typename Iter, typename T>
inline void iota(Iter first, Iter last, T value) {
    while (first != last) {
        *first++ = value++;
    }
}

template <typename T>
inline T pop_random(std::vector<T>& v) {
    typename std::vector<T>::size_type pos = std::rand() % v.size();
    T res = v[pos];

    std::swap(v[pos], v.back());
    v.pop_back();

    return res;
}

// [[Rcpp::export]]
Rcpp::IntegerVector sample_int2(int n, int min, int max) {
    Rcpp::IntegerVector res(n);
    std::vector<int> pool(max + 1 - min);
    iota(pool.begin(), pool.end(), min);

    for (R_xlen_t i = 0; i < n; i++) {
        res[i] = pop_random(pool);
    }

    return res;
}

And generalizing the original solution for comparison:

// [[Rcpp::export]]
Rcpp::IntegerVector sample_int(int n, int min, int max) {
    Rcpp::IntegerVector pool = Rcpp::seq(min, max);
    std::random_shuffle(pool.begin(), pool.end());
    return pool[Rcpp::Range(0, n - 1)];
}

microbenchmark::microbenchmark(
    "sample_int" = sample_int(100, 1, 1e6),
    "sample_int2" = sample_int2(100, 1, 1e6),
    times = 300L
)
# Unit: milliseconds
#         expr       min        lq      mean    median        uq       max neval
#   sample_int 20.639801 22.417594 23.603727 22.922765 23.735258 35.531140   300
#  sample_int2  1.504872  1.689987  1.789866  1.755937  1.830249  2.863399   300

microbenchmark::microbenchmark(
    "sample_int" = sample_int(1e5, 1, 1e6),
    "sample_int2" = sample_int2(1e5, 1, 1e6),
    times = 300L
)
# Unit: milliseconds
#         expr      min        lq      mean    median        uq       max neval
#   sample_int 21.08035 22.384714 23.295403 22.811011 23.282353 34.068462   300
#  sample_int2  3.37047  3.761608  3.992875  3.945773  4.086605  9.134516   300
