
Generating sample of integers in Rcpp

I want to create a random vector of 5 integers drawn from a range such as 1:10. I can use ONLY basic Rcpp (no C libraries).

Currently I have:

#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
NumericVector test(){
NumericVector z(5);
for (int i=0; i<5; ++i)
z[i] = R::runif(1,10);
return z; 
}
/***R
test()
*/

But:

  • the result is not integer-valued

  • the values are not unique.

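This can be done concisely with std::random_shuffle (declared in <algorithm>; note that it was deprecated in C++14 and removed in C++17, so it needs an older language standard):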

#include <Rcpp.h>
#include <algorithm>  // std::random_shuffle

// [[Rcpp::export]]
Rcpp::IntegerVector sample_int() {
    Rcpp::IntegerVector pool = Rcpp::seq(1, 10);
    std::random_shuffle(pool.begin(), pool.end());
    return pool[Rcpp::Range(0, 4)];
} 

Sample output:

sample_int()
# [1] 9 2 5 1 7

sample_int()
# [1]  1 10  5  3  8

sample_int()
# [1] 5 9 3 2 8 
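
If your compiler targets C++17 or later, std::random_shuffle is no longer available. A minimal sketch of the same idea with std::shuffle from C++11 is shown here in plain C++ rather than Rcpp; the name sample_by_shuffle and the choice of std::mt19937 as the generator are assumptions for illustration, not part of the original answer:

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical helper: draw n distinct integers from [min, max] by
// shuffling the whole pool with std::shuffle (C++11) and keeping the
// first n elements.
std::vector<int> sample_by_shuffle(int n, int min, int max, std::mt19937& gen) {
    assert(min <= max && n >= 0 && n <= max - min + 1);
    std::vector<int> pool(max - min + 1);
    std::iota(pool.begin(), pool.end(), min);     // min, min + 1, ..., max
    std::shuffle(pool.begin(), pool.end(), gen);  // uniform random permutation
    pool.resize(n);                               // keep the first n values
    return pool;
}
```

Inside an Rcpp function the result could be returned through Rcpp::wrap. Note that std::mt19937 is seeded independently of R, so set.seed() has no effect on it.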

And for the record, your code wasn't returning integers because

  • R::runif returns double values; and
  • your function's return type was NumericVector rather than IntegerVector.
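
To illustrate the first point, here is a hedged sketch of how a uniform double can be mapped to an integer in a closed range. The helper name to_int_in_range is hypothetical; in an Rcpp function the argument u could come from R::unif_rand(). This only addresses the "not integer" problem: repeated draws can still produce duplicates.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: map a uniform draw u in [0, 1) to an integer in
// [min, max], with every integer covering an equally wide slice of [0, 1).
inline int to_int_in_range(double u, int min, int max) {
    assert(u >= 0.0 && u < 1.0 && min <= max);
    return min + static_cast<int>(std::floor(u * (max - min + 1)));
}
```

Simply truncating R::runif(1, 10) would not be equivalent: values in [1, 10) truncate to 1, ..., 9, so 10 would essentially never be drawn.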

Although it is inconsequential for small ranges such as the one in your example (1, ..., 10), this approach is not very efficient, particularly when the number of elements being sampled is much smaller than the drawing pool, because std::random_shuffle shuffles the entire range. With a couple of auxiliary functions, we can do better (assuming std::rand is "sufficiently" random for your purposes):

#include <Rcpp.h>
#include <algorithm>  // std::swap (C++98)
#include <cstdlib>    // std::rand
#include <vector>

// C++ 98
template <typename Iter, typename T>
inline void iota(Iter first, Iter last, T value) {
    while (first != last) {
        *first++ = value++;
    }
}

template <typename T>
inline T pop_random(std::vector<T>& v) {
    typename std::vector<T>::size_type pos = std::rand() % v.size();
    T res = v[pos];

    std::swap(v[pos], v.back());
    v.pop_back();

    return res;
}

// [[Rcpp::export]]
Rcpp::IntegerVector sample_int2(int n, int min, int max) {
    Rcpp::IntegerVector res(n);
    std::vector<int> pool(max + 1 - min);
    iota(pool.begin(), pool.end(), min);

    for (R_xlen_t i = 0; i < n; i++) {
        res[i] = pop_random(pool);
    }

    return res;
}
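
One caveat with std::rand() % v.size(): RAND_MAX is only guaranteed to be at least 32767, so on platforms where it is that small, positions beyond RAND_MAX in a pool of 1e6 elements would never be picked, and the modulo introduces some bias in any case. If C++11 is available, a partial Fisher-Yates shuffle over <random> avoids both issues. The following is a sketch in plain C++ under those assumptions; sample_partial is a hypothetical name and std::mt19937 is just one possible generator:

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

// Hypothetical variant of sample_int2: a partial Fisher-Yates shuffle that
// fixes only the first n positions instead of shuffling the whole pool.
std::vector<int> sample_partial(int n, int min, int max, std::mt19937& gen) {
    assert(min <= max && n >= 0 && n <= max - min + 1);
    std::vector<int> pool(max - min + 1);
    std::iota(pool.begin(), pool.end(), min);  // min, min + 1, ..., max

    const std::size_t size = pool.size();
    for (std::size_t i = 0; i < static_cast<std::size_t>(n); ++i) {
        // Pick uniformly among the positions not yet fixed: [i, size - 1].
        std::uniform_int_distribution<std::size_t> pick(i, size - 1);
        std::swap(pool[i], pool[pick(gen)]);
    }
    pool.resize(n);  // the first n positions hold the sample
    return pool;
}
```

As with sample_int2, building the pool still costs time proportional to max - min + 1; only the shuffling work is reduced to n swaps.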

And generalizing the original solution for comparison:

// [[Rcpp::export]]
Rcpp::IntegerVector sample_int(int n, int min, int max) {
    Rcpp::IntegerVector pool = Rcpp::seq(min, max);
    std::random_shuffle(pool.begin(), pool.end());
    return pool[Rcpp::Range(0, n - 1)];
}

microbenchmark::microbenchmark(
    "sample_int" = sample_int(100, 1, 1e6),
    "sample_int2" = sample_int2(100, 1, 1e6),
    times = 300L
)
# Unit: milliseconds
#         expr       min        lq      mean    median        uq       max neval
#   sample_int 20.639801 22.417594 23.603727 22.922765 23.735258 35.531140   300
#  sample_int2  1.504872  1.689987  1.789866  1.755937  1.830249  2.863399   300

microbenchmark::microbenchmark(
    "sample_int" = sample_int(1e5, 1, 1e6),
    "sample_int2" = sample_int2(1e5, 1, 1e6),
    times = 300L
)
# Unit: milliseconds
#         expr      min        lq      mean    median        uq       max neval
#   sample_int 21.08035 22.384714 23.295403 22.811011 23.282353 34.068462   300
#  sample_int2  3.37047  3.761608  3.992875  3.945773  4.086605  9.134516   300


 