
Generating random floating-point values based on random bit stream

Given a random source (a generator of a random bit stream), how do I generate a uniformly distributed random floating-point value in a given range?

Assume that my random source looks something like:

unsigned int GetRandomBits(char* pBuf, int nLen);

And I want to implement:

double GetRandomVal(double fMin, double fMax);

Notes:

  • I don't want the result precision to be limited (for example, to only 5 digits).
  • Strict uniform distribution is a must.
  • I'm not asking for a reference to an existing library. I want to know how to implement it from scratch.
  • For pseudo-code / code, C++ would be most appreciated.

I don't think I'll ever be convinced that you actually need this, but it was fun to write.

#include <stdint.h>

#include <cmath>
#include <cstdio>

FILE* devurandom;

bool geometric(int x) {
  // returns true with probability min(2^-x, 1)
  if (x <= 0) return true;
  while (1) {
    uint8_t r;
    fread(&r, sizeof r, 1, devurandom);
    if (x < 8) {
      return (r & ((1 << x) - 1)) == 0;
    } else if (r != 0) {
      return false;
    }
    x -= 8;
  }
}

double uniform(double a, double b) {
  // requires IEEE doubles and 0.0 < a < b < inf and a normal
  // implicitly computes a uniform random real y in [a, b)
  // and returns the greatest double x such that x <= y
  union {
    double f;
    uint64_t u;
  } convert;
  convert.f = a;
  uint64_t a_bits = convert.u;
  convert.f = b;
  uint64_t b_bits = convert.u;
  uint64_t mask = b_bits - a_bits;
  mask |= mask >> 1;
  mask |= mask >> 2;
  mask |= mask >> 4;
  mask |= mask >> 8;
  mask |= mask >> 16;
  mask |= mask >> 32;
  int b_exp;
  frexp(b, &b_exp);
  while (1) {
    // sample uniform x_bits in [a_bits, b_bits)
    uint64_t x_bits;
    fread(&x_bits, sizeof x_bits, 1, devurandom);
    x_bits &= mask;
    x_bits += a_bits;
    if (x_bits >= b_bits) continue;
    double x;
    convert.u = x_bits;
    x = convert.f;
    // accept x with probability proportional to 2^x_exp
    int x_exp;
    frexp(x, &x_exp);
    if (geometric(b_exp - x_exp)) return x;
  }
}

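int main() {
  // open in binary mode and fail early if the random source is unavailable
  devurandom = fopen("/dev/urandom", "rb");
  if (!devurandom) {
    perror("/dev/urandom");
    return 1;
  }
  for (int i = 0; i < 100000; ++i) {
    printf("%.17g\n", uniform(1.0 - 1e-15, 1.0 + 1e-15));
  }
  fclose(devurandom);
}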

Here is one way of doing it.

The IEEE Std 754 double format is as follows:

[s][     e     ][                          f                         ]

where s is the sign bit (1 bit), e is the biased exponent (11 bits) and f is the fraction (52 bits).

Beware that the byte order in memory will be different on little-endian machines.

For 0 < e < 2047, the number represented is

(-1)**s  *  2**(e - 1023)  *  (1.f)
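As a rough illustration of the layout and formula above, the following sketch splits a double into its three fields and rebuilds the value for the normal case 0 < e < 2047. The names DoubleFields, split_double and value_from_fields are hypothetical helpers made up for this example, and the code assumes a 64-bit IEEE 754 double.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper type for this illustration: the three fields of a double.
struct DoubleFields {
    uint64_t sign;      // s, 1 bit
    uint64_t exponent;  // e, 11 bits, biased by 1023
    uint64_t fraction;  // f, 52 bits
};

// Copies the bit pattern out with memcpy, so the byte order in memory does not matter.
DoubleFields split_double(double x) {
    static_assert(sizeof(double) == sizeof(uint64_t), "assumes a 64-bit double");
    uint64_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    DoubleFields f;
    f.sign = bits >> 63;
    f.exponent = (bits >> 52) & 0x7FF;
    f.fraction = bits & 0xFFFFFFFFFFFFFull;
    return f;
}

// Evaluates (-1)**s * 2**(e - 1023) * (1.f) for normal numbers only.
double value_from_fields(const DoubleFields& f) {
    assert(f.exponent > 0 && f.exponent < 2047);
    // 1.f = 1 + f / 2**52; every step here is exact in double arithmetic.
    double significand = 1.0 + std::ldexp(static_cast<double>(f.fraction), -52);
    double magnitude = std::ldexp(significand, static_cast<int>(f.exponent) - 1023);
    return f.sign ? -magnitude : magnitude;
}
```

For example, split_double(1.0) gives s = 0, e = 1023 and f = 0, which is the starting point for the [1.0, 2.0) construction.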

By setting s to 0, e to 1023 and f to 52 random bits from your bit stream, you get a random double in the interval [1.0, 2.0). This interval is convenient in that it contains exactly 2 ** 52 doubles, and these doubles are equidistant. If you then subtract 1.0 from the constructed double, you get a random double in the interval [0.0, 1.0). Moreover, the property of being equidistant is preserved. From there you should be able to scale and translate as needed.

I'm surprised that for a question this old, nobody had actual code for the best answer. User515430's answer got it right: you can take advantage of the IEEE-754 double format to directly put 52 bits into a double with no math at all. But he didn't give code. So here it is, from my public domain ojrandlib:

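double ojr_next_double(ojr_generator *g) {
    // exponent field 0x3FF (e = 1023) plus 52 random fraction bits: a value in [1.0, 2.0)
    uint64_t r = (OJR_NEXT64(g) & 0xFFFFFFFFFFFFFull) | 0x3FF0000000000000ull;
    double d;
    memcpy(&d, &r, sizeof d);  // copy the bit pattern without violating strict aliasing (needs <string.h>)
    return d - 1.0;
}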

OJR_NEXT64() gets a 64-bit random number. If you have a more efficient way of getting only 52 bits, use that instead.
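For readers without ojrandlib, here is a minimal self-contained sketch of the same idea wired to the question's interface. The body of GetRandomBits is only a deterministic stand-in (a fixed-seed std::mt19937) so the example compiles; its return value is assumed to be the number of bytes written, which the question does not specify. UnitFromBits is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <random>

// Stand-in for the question's random source (an assumption for this sketch):
// fills pBuf with nLen bytes and returns the number of bytes written.
unsigned int GetRandomBits(char* pBuf, int nLen) {
    static std::mt19937 gen(12345);  // fixed seed: for demonstration only
    for (int i = 0; i < nLen; ++i) pBuf[i] = static_cast<char>(gen() & 0xFF);
    return static_cast<unsigned int>(nLen);
}

// Maps 64 random bits to a double in [0.0, 1.0) using the exponent trick:
// keep 52 fraction bits, force the exponent field to 1023, then subtract 1.0.
double UnitFromBits(uint64_t r) {
    uint64_t bits = (r & 0xFFFFFFFFFFFFFull) | 0x3FF0000000000000ull;
    double d;
    std::memcpy(&d, &bits, sizeof d);
    return d - 1.0;
}

// Scales and translates the unit value into the requested range.
double GetRandomVal(double fMin, double fMax) {
    assert(fMin < fMax);
    uint64_t r;
    GetRandomBits(reinterpret_cast<char*>(&r), static_cast<int>(sizeof r));
    return fMin + (fMax - fMin) * UnitFromBits(r);
}
```

Note that the final scaling step rounds, so the result can occasionally land exactly on fMax, and not every double in the range can be produced. Whether that matters depends on how strictly "uniform" is interpreted.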

This is easy, as long as you have an integer type with as many bits of precision as a double. For instance, an IEEE double-precision number has 53 bits of precision, so a 64-bit integer type is enough:

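#include <limits.h>
double GetRandomVal(double fMin, double fMax) {
  unsigned long long n ;
  GetRandomBits ((char*)&n, sizeof(n)) ;
  // keep only the top 53 bits so the conversion to double is exact,
  // then divide by 2^53 to get a value in [0, 1)
  n >>= sizeof(n) * CHAR_BIT - 53 ;
  double u = n / 9007199254740992.0 ;
  // scale after normalizing, so a large n cannot overflow the product
  return fMin + u * (fMax - fMin) ;
}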

This is probably not the answer you want, but the specification here:

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3225.pdf

in sections [rand.util.canonical] and [rand.dist.uni.real], contains sufficient information to implement what you want, though with slightly different syntax. It isn't easy, but it is possible. I speak from personal experience. A year ago I knew nothing about random numbers, and I was able to do it. Though it took me a while... :-)
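As a rough sketch of what [rand.util.canonical] describes: call the generator k = max(1, ceil(b / log2 R)) times, accumulate S = sum of g_i * R^i in floating point, and return S / R^k. The version below is specialized to a source that yields uniform 32-bit values (so R = 2^32 and the minimum is 0), which is an assumption of this example. The function name canonical_from_32bit is hypothetical, and the final clamp is an addition of this sketch rather than part of the specification text, since rounding in the sum can otherwise produce exactly 1.0.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Gen32 is any callable returning uniformly distributed uint32_t values
// (an assumption of this sketch, standing in for the standard's generator g).
template <class Gen32>
double canonical_from_32bit(Gen32& g) {
    const int b = std::numeric_limits<double>::digits;  // 53 for IEEE doubles
    const double R = 4294967296.0;                      // 2^32 possible outputs
    const int k = (b + 31) / 32;                        // ceil(b / log2 R)
    assert(k >= 1);
    double S = 0.0;
    double scale = 1.0;  // R^i while accumulating, R^k after the loop
    for (int i = 0; i < k; ++i) {
        S += static_cast<double>(g()) * scale;
        scale *= R;
    }
    double result = S / scale;
    // Not in the specification text: guard against the sum rounding up to R^k.
    if (result >= 1.0) result = std::nextafter(1.0, 0.0);
    return result;
}
```

A value in [fMin, fMax) can then be formed as fMin + (fMax - fMin) * canonical_from_32bit(g), which is the transformation [rand.dist.uni.real] builds on.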

The question is ill-posed. What does uniform distribution over floats even mean?

Taking our cue from discrepancy, one way to operationalize your question is to define that you want the distribution that minimizes the following value:

\int_{t=\mathrm{fmin}}^{\mathrm{fmax}} \left( p\left(x \leq t\right) - \frac{t - \mathrm{fmin}}{\mathrm{fmax} - \mathrm{fmin}} \right)^2 \, dt

Where x is the random variable you are sampling with your GetRandomVal(double fMin, double fMax) function, and p(x <= t) means the probability that a random x is smaller than or equal to t.

And now you can go on and try to evaluate e.g. a dabbler's answer. (Hint: all the answers that fail to use the whole precision and stick to e.g. 52 bits will fail this minimization criterion.)

However, if you just want to be able to generate all float bit patterns that fall into your specified range with equal probability, even if that means that e.g. asking for GetRandomVal(0,1000) will create more values between 0 and 1.5 than between 1.5 and 1000, that's easy: any interval of IEEE floating point numbers, when interpreted as bit patterns, maps easily to a very small number of intervals of unsigned int64. See e.g. this question. Generating equally distributed random values of unsigned int64 in any given interval is easy.
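A minimal sketch of that bit-pattern approach, restricted to non-negative finite ranges so that the doubles map to a single interval of unsigned 64-bit integers (non-negative IEEE doubles sort in the same order as their bit patterns). The names BitsOf, DoubleOf and UniformBitPattern are hypothetical, and next64 is assumed to be any callable that returns uniform 64-bit values.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

uint64_t BitsOf(double x) {
    uint64_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    return bits;
}

double DoubleOf(uint64_t bits) {
    double x;
    std::memcpy(&x, &bits, sizeof x);
    return x;
}

// Returns a double in [lo, hi] such that every representable value in the
// range is equally likely. Requires 0 <= lo <= hi < inf (and lo is not -0.0).
template <class Next64>
double UniformBitPattern(double lo, double hi, Next64& next64) {
    assert(!std::signbit(lo) && lo <= hi && std::isfinite(hi));
    const uint64_t lo_bits = BitsOf(lo);
    const uint64_t span = BitsOf(hi) - lo_bits;  // number of patterns minus one
    // Smallest all-ones mask covering span, so rejection succeeds at least half the time.
    uint64_t mask = span;
    mask |= mask >> 1;
    mask |= mask >> 2;
    mask |= mask >> 4;
    mask |= mask >> 8;
    mask |= mask >> 16;
    mask |= mask >> 32;
    for (;;) {
        uint64_t r = next64() & mask;
        if (r <= span) return DoubleOf(lo_bits + r);  // otherwise reject and retry
    }
}
```

A range that spans zero would need two such integer intervals (one for the negative patterns, one for the non-negative ones), which this sketch leaves out.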

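I may have misunderstood the question, but what prevents you from simply sampling the next n bits from the random bit stream and converting them to a base-10 number in the range 0 to 2^n-1?

To get a random value in [0..1[ you could do something like: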

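// random_bit() is assumed to return 0 or 1 from the bit stream;
// the wrapper name random_unit is hypothetical
double random_unit() {
  double value = 0;
  for (int i=0;i<53;i++)
     value = 0.5 * (value + random_bit());  // Insert 1 random bit
     // or value = ldexp(value+random_bit(),-1);
     // or group several bits into one single ldexp
  return value;
}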
