
How many random numbers can std::uniform_real_distribution generate before losing randomness?

I am writing C++ code for a Monte Carlo simulation, so I need to generate many numbers uniformly distributed in [0,1). I included the following code, taken from here, to generate my numbers:

// uniform_real_distribution
#include <iostream>
#include <random>

std::default_random_engine generator;
std::uniform_real_distribution<double> distribution(0.0,1.0);

int main()
{  
    double number = distribution(generator); //rnd number uniformly distributed between [0,1)
    return 0;
}

So every time I need a new number, I just call distribution(generator). I run my Monte Carlo simulation to get many sample results. The results should be normally distributed around the true mean (which is unknown). When I run a chi-square goodness-of-fit test to check whether they are normally distributed, my sample results sometimes fail the test. The key word here is "sometimes", which made me think that I had called distribution(generator) too many times and that the generated numbers had eventually lost their randomness. I am talking about 10^11 numbers generated in each simulation.

Could it be possible? What if I reset the distribution with distribution.reset() before I call it? Would that solve my problem?

Thanks for any suggestion you may have.

If a random number generator never fails a test, then the test is too weak. For example, if a test is run at a 99% confidence level, a perfect random number generator should be expected to fail it about 1% of the time.

For example, consider a perfectly fair coin. If you flip it 1,000 times, you will get on average 500 heads. If you want to use this as a test for randomness, you compute the range of values that a fair coin will fall within some percentage of the time. Then you make sure your random number generator doesn't fail the test more often than expected.
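As a rough illustration of the coin example, the acceptance range can be computed with the normal approximation to the binomial distribution. This is only a sketch: the function name fair_coin_range and the choice of the normal approximation are assumptions made here, not something taken from the thread.

```cpp
#include <cassert>
#include <cmath>
#include <utility>

// Two-sided acceptance range for the number of heads in n flips of a fair
// coin, using the normal approximation to the binomial distribution.
// z is the standard-normal quantile for the chosen confidence level
// (about 2.576 for 99%, about 1.960 for 95%).
// fair_coin_range is a hypothetical helper name used for illustration.
std::pair<double, double> fair_coin_range(int n, double z) {
    assert(n > 0 && z > 0.0);
    const double mean = 0.5 * n;            // expected number of heads
    const double sd = std::sqrt(0.25 * n);  // sqrt(n * p * (1 - p)) with p = 0.5
    return {mean - z * sd, mean + z * sd};
}
```

For 1,000 flips at the 99% level, fair_coin_range(1000, 2.576) gives a range of roughly 459 to 541 heads. A fair coin lands outside that range about 1% of the time, so an occasional failure is expected.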

Your testing methodology -- expecting a random number generator to pass every test every time -- only works if your tests are very weak. That would allow poor random number generators to pass too often and is not a good testing methodology.

True story: A random number generator that I implemented was rigorously tested by an independent testing lab. They subjected it to 100 tests, each using millions of samples and testing for various properties. Each test was run at a 99% confidence level. The RNG failed 3 tests, which was within the expected range, and so it passed the testing portion of the certification. That an RNG passes these extremely rigorous tests the vast majority of the time demonstrates that it's a very, very good RNG, perhaps perfect. It's hard to write a broken RNG that ever passes any of these tests.

You need to compute the probability that a perfect RNG will fail your test and then see if your RNG shows a failure rate close to that expected.
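One way to check this in practice is to repeat a test many times and compare the observed failure rate with the nominal one. The sketch below is an assumption-laden example, not the asker's actual test: it tests uniformity over 10 bins rather than normality of simulation results, it uses std::mt19937_64 with a fixed seed, and the helper names are invented for illustration. The critical value 21.666 is the 0.99 quantile of the chi-square distribution with 9 degrees of freedom.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <random>

// Chi-square statistic for uniformity: draws `samples` values in [0,1),
// sorts them into 10 equal-width bins, and compares the bin counts with
// the expected count per bin. (Hypothetical helper for illustration.)
double chi_square_uniform(std::mt19937_64& engine, int samples) {
    assert(samples > 0);
    constexpr std::size_t kBins = 10;
    std::array<int, kBins> counts{};
    std::uniform_real_distribution<double> dist(0.0, 1.0);
    for (int i = 0; i < samples; ++i) {
        std::size_t bin = static_cast<std::size_t>(dist(engine) * kBins);
        if (bin >= kBins) bin = kBins - 1;  // guard against a value rounding to 1.0
        ++counts[bin];
    }
    const double expected = static_cast<double>(samples) / kBins;
    double chi2 = 0.0;
    for (int c : counts) {
        const double diff = c - expected;
        chi2 += diff * diff / expected;
    }
    return chi2;
}

// Fraction of `trials` tests that reject uniformity at the 1% level.
// All trials share one engine that is seeded once.
double observed_failure_rate(unsigned seed, int trials, int samples_per_trial) {
    assert(trials > 0);
    constexpr double kCritical99 = 21.666;  // chi-square 0.99 quantile, 9 d.o.f.
    std::mt19937_64 engine(seed);
    int failures = 0;
    for (int t = 0; t < trials; ++t) {
        if (chi_square_uniform(engine, samples_per_trial) > kCritical99) ++failures;
    }
    return static_cast<double>(failures) / trials;
}
```

With, say, 2,000 trials of 1,000 samples each, a healthy generator should give a failure rate near 0.01. A rate of zero would be as suspicious as a rate of 10%.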

The algorithm behind std::default_random_engine is left to the implementation by the standard, so you cannot know for sure how long the random sequence is without knowing which generator your standard library actually uses.

It's probably one of a small set of generators known to be good and fast, like the Mersenne Twister or CMWC.

There are many ways that random number generators are rated, but in your question I think you want to know the period - how long until the numbers repeat. The period will also depend on the initial conditions.

A good standard CMWC generator, CMWC 4096, has a period of 2^131104. A standard Mersenne Twister generator, MT19937, has a period of 2^19937 − 1.

But all bets are off if the standard library implementation you are using has chosen a poor algorithm.
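To see what this means on a particular toolchain, a small sketch like the following can report which engine std::default_random_engine refers to, and put the 10^11 draws from the question on the same log2 scale as the periods quoted above. The function names default_engine_name and log2_draws are made up for illustration. For comparison, the std::minstd_rand0 and std::minstd_rand linear congruential engines have a period of 2^31 − 2 (about 2.1 × 10^9), well below 10^11, and some standard libraries use one of them as the default engine.

```cpp
#include <cassert>
#include <cmath>
#include <random>
#include <string>
#include <type_traits>

// Reports which standard engine std::default_random_engine aliases on the
// current implementation (the standard leaves this choice to the vendor).
// Only a few common candidates are checked; anything else is "other".
std::string default_engine_name() {
    using E = std::default_random_engine;
    if (std::is_same<E, std::minstd_rand0>::value) return "minstd_rand0";
    if (std::is_same<E, std::minstd_rand>::value)  return "minstd_rand";
    if (std::is_same<E, std::mt19937>::value)      return "mt19937";
    if (std::is_same<E, std::mt19937_64>::value)   return "mt19937_64";
    return "other";
}

// log2 of the number of draws, for comparison with a period given as 2^k.
double log2_draws(double draws) {
    assert(draws > 0.0);
    return std::log2(draws);
}
```

log2_draws(1e11) is about 36.5, so 10^11 draws are negligible next to a period of 2^19937 − 1, but they exceed a period of about 2^31 many times over. If default_engine_name() returns one of the minstd names, the sequence would wrap around repeatedly during a run of that size.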

Reseeding before every call, or even just frequently, will ruin the statistical properties of the generator, especially if the seeds are not well chosen. You're usually best off seeding it once and calling it from there.
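A minimal sketch of the "seed once" pattern follows. It assumes std::mt19937_64 as the engine, which is a choice made for this example rather than something prescribed above, and draw_uniform is a hypothetical helper name. The engine is created and seeded by the caller a single time and then passed by reference wherever numbers are needed.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Draws `count` values from uniform_real_distribution(0.0, 1.0) using an
// engine owned by the caller. The engine is never reseeded here; each call
// simply continues the engine's sequence where the previous call stopped.
std::vector<double> draw_uniform(std::mt19937_64& engine, int count) {
    assert(count >= 0);
    std::uniform_real_distribution<double> dist(0.0, 1.0);
    std::vector<double> out;
    out.reserve(static_cast<std::size_t>(count));
    for (int i = 0; i < count; ++i) {
        out.push_back(dist(engine));
    }
    return out;
}
```

A caller might construct the engine once with std::mt19937_64 engine(std::random_device{}()); for a nondeterministic seed, or with a fixed integer seed when reproducible runs are wanted, and then reuse that one engine for the whole simulation.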

Note that the quality of your random sequence depends on the generator, not on the distribution.

About std::default_random_engine, the reference says it is "a generator that provides at least acceptable engine behavior for relatively casual, inexpert, and/or lightweight use"... probably not what you want.

As suggested, you can replace it with std::mt19937. I am not an expert, so I don't know how long you can use it before losing randomness.

To renew the randomness of your generator, you can use a std::random_device and use it to seed() the generator from time to time. On some implementations (you'll have to check), random_device even uses special CPU instructions to generate "hard" random numbers as seeds. Alas, you cannot simply reseed every time, because such hardware generation is quite slow.
