Efficient random number generation with C++11 <random>

Question

I am trying to understand how the C++11 random number generation features are meant to be used. My concern is performance.

Suppose that we need to generate a series of random integers in the range 0..k, but k changes at every step. What is the best way to proceed?

Example:

for (int i=0; i < n; ++i) {
    int k = i; // of course this is more complicated in practice
    std::uniform_int_distribution<> dist(0, k);
    int random_number = dist(engine);
    // do something with random number
}

The distributions that the <random> header provides are very convenient, but they are opaque to the user, so I cannot easily predict how they will perform. It is not clear, for example, how much runtime overhead (if any) is caused by constructing dist in the loop above.

Instead I could have used something like

std::uniform_real_distribution<> dist(0.0, 1.0);
for (int i=0; i < n; ++i) {
    int k = i; // of course this is more complicated in practice
    int random_number = std::floor( (k+1)*dist(engine) );
    // do something with random number
}

which avoids constructing a new object in each iteration.
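One caveat with the scaling approach, sketched here under an assumption: if the real-valued distribution ever returned exactly 1.0 (some standard library implementations have been reported to do so in rare cases because of rounding), then floor((k+1)*u) would give k+1, which is out of range. The helper name scale_to_index is hypothetical and not part of the question; it simply clamps the result.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper (not from the question): maps u in [0, 1] to an
// integer in [0, k]. The clamp guards against the edge case u == 1.0.
inline int scale_to_index(double u, int k)
{
    assert(k >= 0 && u >= 0.0 && u <= 1.0);
    const int scaled = static_cast<int>(std::floor((k + 1) * u));
    return std::min(scaled, k);
}
```

In the loop above this would be called as scale_to_index(dist(engine), k).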

Random numbers are often used in numerical simulations where performance is important. What is the best way to use <random> in these situations?

Please do not answer "profile it". Profiling is part of effective optimization, but so is a good understanding of how a library is meant to be used and what its performance characteristics are. If the answer is that it depends on the standard library implementation, or that the only way to know is to profile it, then I would rather not use the distributions from <random> at all. Instead I can use my own implementation, which will be transparent to me and much easier to optimize if and when necessary.

Answer 1

One thing you can do is keep a permanent distribution object, so that you only create a param_type object on each call, like this:

#include <random>

template<typename Integral>
Integral randint(Integral min, Integral max)
{
    using param_type =
        typename std::uniform_int_distribution<Integral>::param_type;

    // only create these once (per thread)
    thread_local static std::mt19937 eng {std::random_device{}()};
    thread_local static std::uniform_int_distribution<Integral> dist;

    // presumably a param_type is cheaper than a uniform_int_distribution
    return dist(eng, param_type{min, max});
}
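As a sketch of how the same idea might look in the question's loop: the function below constructs the distribution once and passes a fresh param_type on each call. The name draw_sequence and the choice to return a std::vector are assumptions made for illustration, not part of the answer.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: for each i in 0..n-1, draws one integer from [0, i],
// reusing a single distribution object and supplying the bounds per call.
template <typename Engine>
std::vector<int> draw_sequence(Engine& engine, int n)
{
    assert(n >= 0);
    using dist_type = std::uniform_int_distribution<int>;

    dist_type dist;  // constructed once; its default bounds are overridden below
    std::vector<int> out;
    out.reserve(static_cast<std::size_t>(n));

    for (int i = 0; i < n; ++i) {
        const int k = i;  // stands in for the question's changing upper bound
        out.push_back(dist(engine, dist_type::param_type{0, k}));
    }
    return out;
}
```

A caller would create an engine such as std::mt19937 once and pass it in by reference, so neither the engine nor the distribution is rebuilt inside the loop.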

Answer 2

To maximize performance, first consider a different PRNG, such as xorshift128+. It has been reported to be more than twice as fast as mt19937 for 64-bit random numbers; see http://xorshift.di.unimi.it/. It can also be implemented in a few lines of code.
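To illustrate the "few lines of code" claim, here is a minimal sketch of an xorshift128+ engine. It assumes the commonly published variant with shift constants 23, 18 and 5, and it seeds the state with splitmix64; the class name and the seeding choice are assumptions for this example, not something the answer specifies.

```cpp
#include <cstdint>
#include <limits>

// Sketch of an xorshift128+ engine (shift constants 23, 18, 5 assumed).
// It exposes result_type, min(), max() and operator() so that it can be
// passed to the <random> distributions like any other engine.
class xorshift128plus
{
public:
    using result_type = std::uint64_t;

    explicit xorshift128plus(std::uint64_t seed)
    {
        // Expand the seed with splitmix64 so the state is never all zeros.
        s_[0] = splitmix64(seed);
        s_[1] = splitmix64(seed);
    }

    static constexpr result_type min() { return 0; }
    static constexpr result_type max()
    {
        return std::numeric_limits<result_type>::max();
    }

    result_type operator()()
    {
        std::uint64_t s1 = s_[0];
        const std::uint64_t s0 = s_[1];
        s_[0] = s0;
        s1 ^= s1 << 23;
        s_[1] = s1 ^ s0 ^ (s1 >> 18) ^ (s0 >> 5);
        return s_[1] + s0;
    }

private:
    // Advances x and returns a well-mixed 64-bit value derived from it.
    static std::uint64_t splitmix64(std::uint64_t& x)
    {
        std::uint64_t z = (x += 0x9e3779b97f4a7c15ULL);
        z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9ULL;
        z = (z ^ (z >> 27)) * 0x94d049bb133111ebULL;
        return z ^ (z >> 31);
    }

    std::uint64_t s_[2];
};
```

An instance can be used wherever the examples above use engine, for instance with std::uniform_int_distribution.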

Moreover, if you don't need a "perfectly balanced" uniform distribution and your k is much less than 2^64 (which it likely is), I would suggest simply writing something like:

uint64_t temp = engine_64(); // generates 0 <= temp < 2^64
int random_number = temp % (k + 1); // crop temp to 0,...,k

Note, however, that integer division/modulo operations are not cheap. For example, on an Intel Haswell processor they take 39-103 processor cycles for 64-bit numbers, which is likely much longer than a call to an MT19937 or xorshift+ engine.
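Given that cost, one possible alternative (a swapped-in technique, not something the answer proposes) is to replace the modulo with a multiplication and a shift. The sketch below assumes a 32-bit random input and uses the hypothetical name reduce_to_range; like the modulo version, it is slightly biased when the range does not divide 2^32.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: maps a uniformly distributed 32-bit value x to
// [0, range) by taking the high 32 bits of the 64-bit product x * range.
inline std::uint32_t reduce_to_range(std::uint32_t x, std::uint32_t range)
{
    assert(range > 0);
    return static_cast<std::uint32_t>(
        (static_cast<std::uint64_t>(x) * range) >> 32);
}
```

With the 64-bit engine above, the input could be taken from the upper half of its output, for example reduce_to_range(static_cast<std::uint32_t>(engine_64() >> 32), k + 1) for a non-negative k.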

Answer 1: score 6, accepted, 2016-03-10 11:33:57

Answer 2: score 2, 2016-03-10 15:22:20
