
Efficient way to generate lots of random numbers

I have a Java method that has to generate lots of random numbers in a very short period of time. My first approach was to use Math.random (which works really fast), but I suspect that because I call Math.random so quickly, one call right after the other, the "random" isn't really random (or is less random) as a result. I need it to be as random as possible.

I now have two questions:

  1. Is my presumption right that, because of the number of calls in a very short period of time, the output gets less random? And if the answer to 1. is yes:
  2. What would be the fastest way (per call) to fix the reduced randomness?

I have already played around with SecureRandom, but it is at least 15 times slower than the normal Math.random, which is too slow for my requirements.

TL;DR: Your presumption is wrong.

Math.random acts on a single instance of java.util.Random:

Returns a double value with a positive sign, greater than or equal to 0.0 and less than 1.0. Returned values are chosen pseudorandomly with (approximately) uniform distribution from that range.

When this method is first called, it creates a single new pseudorandom-number generator, exactly as if by the expression

new java.util.Random()

From the JavaDoc

Now, java.util.Random uses a linear congruential formula that is seeded with a number that is "very likely to be distinct from any other invocation of this constructor."

As this is a pseudorandom progression (i.e. it will give exactly the same values from the same seed), the speed at which you extract numbers from Math.random has no impact on their randomness.
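To make the "linear congruential formula" concrete, here is a minimal sketch of the update step that the java.util.Random Javadoc documents (a 48-bit state, multiplier 0x5DEECE66D, increment 0xB). The class name LcgSketch is made up for illustration. Note that nothing in it reads a clock, so the output cannot depend on how fast it is called.

```java
import java.util.Random;

// Illustrative re-implementation of the documented java.util.Random step.
class LcgSketch {
    private static final long MULTIPLIER = 0x5DEECE66DL;
    private static final long INCREMENT = 0xBL;
    private static final long MASK = (1L << 48) - 1; // keep the low 48 bits

    private long state;

    LcgSketch(long seed) {
        // Same initial scramble that Random.setSeed is documented to apply.
        state = (seed ^ MULTIPLIER) & MASK;
    }

    int nextInt() {
        // One linear congruential step, then take the top 32 of the 48 bits.
        state = (state * MULTIPLIER + INCREMENT) & MASK;
        return (int) (state >>> 16);
    }

    public static void main(String[] args) {
        LcgSketch mine = new LcgSketch(42L);
        Random jdk = new Random(42L);
        for (int i = 0; i < 3; i++) {
            // Both columns should show the same value on each row.
            System.out.println(mine.nextInt() + " " + jdk.nextInt());
        }
    }
}
```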

Random numbers from the Random class come from an algorithm that bit-mangles an int to give you a new int. It will use the same algorithm regardless of how quickly or how many times you call it. The progression is the progression.

To test this, seed it with a number, like 42, then watch the progression. Seed it with the same number again: same exact progression.
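The experiment described above can be sketched as follows. One sequence is drawn back to back and the other with a pause between calls; both use the seed 42 from the text. The class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.Random;

class SeedDemo {
    // Draw the first n ints from a java.util.Random created with the given seed.
    static int[] firstInts(long seed, int n) {
        Random rand = new Random(seed);
        int[] out = new int[n];
        for (int i = 0; i < n; i++) {
            out[i] = rand.nextInt();
        }
        return out;
    }

    public static void main(String[] args) throws InterruptedException {
        int[] fast = firstInts(42L, 5);      // drawn as quickly as possible

        Random slow = new Random(42L);
        int[] paced = new int[5];
        for (int i = 0; i < paced.length; i++) {
            Thread.sleep(50);                // deliberately slow down the calls
            paced[i] = slow.nextInt();
        }

        // Call timing does not matter: the two sequences are identical.
        System.out.println(Arrays.equals(fast, paced)); // prints true
    }
}
```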

The downside to this approach is that the numbers are not TRULY random. They're pretty random, and good enough for most things, but not perfectly random.

I ran the output of the Random class through the Diehard battery of tests. It passed most of them with flying colors, one was borderline, and one it just flat-out failed. That's the kind of random we're talking about.

Plus, because it uses a timestamp to seed itself, it is somewhat predictable in some circumstances. Picture someone who boots up and runs your task first thing every Monday morning. There is some predictability because it will run with a timestamp from Monday morning between 8:00 and 8:30.

So, Random is good enough for most operations that don't have to do with security, and even for a lot of those that do.

SecureRandom, on the other hand, will generate truly random numbers. It does this by looking at system timings and other things that vary from second to second based on a myriad of factors.

The downside is that these factors only change so many times per second, so SecureRandom can only generate a limited number of random numbers in a given period of time. It does try to generate some ahead of time and cache them for use, but you can exhaust the cache.

In this way, it's like my reverse osmosis water filter. It holds a gallon of water that it has already filtered. If you use the whole gallon in one shot, then you get more only at the rate it filters, something like 1 ounce per 5 seconds. The first gallon is fast, then it's really slow.

If you can use Java 8, I recommend java.util.SplittableRandom. It is faster and has a better statistical distribution. In my test, java.util.SplittableRandom was 30 times faster than java.util.Random.

I used tobijdc's answer to write this answer.
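A minimal usage sketch of the JDK class java.util.SplittableRandom follows. The dice-rolling workload and the names SplittableDemo and sumOfRolls are made up for illustration, and no timing claim is made here. One caveat worth knowing: an instance is not thread-safe, so the intended pattern is to call split() and hand each thread its own generator.

```java
import java.util.SplittableRandom;

class SplittableDemo {
    // Sum n dice rolls (values 1..6) drawn from the given generator.
    static long sumOfRolls(SplittableRandom rng, int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += rng.nextInt(1, 7); // origin inclusive, bound exclusive
        }
        return sum;
    }

    public static void main(String[] args) {
        SplittableRandom rng = new SplittableRandom();  // or new SplittableRandom(seed) for repeatability
        SplittableRandom worker = rng.split();          // separate generator, e.g. for another thread

        System.out.println(sumOfRolls(rng, 1_000_000));
        System.out.println(sumOfRolls(worker, 1_000_000));
    }
}
```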

  1. The (pseudo) random number generator produces the same results given the same initial seed value, regardless of the frequency with which it is called. It is entirely deterministic and independent of speed. The selection of the seed depends on the time (if not explicitly specified), but the generated sequence does not.
  2. If you need more speed, you can pre-compute the values of a pseudorandom sequence longer than the length you need, and then use just one call to the generator to select a starting position in the sequence. This way, on all subsequent runs you can simply read out values after one initial call. Your performance will be limited by the speed at which you can index and read the memory holding the table. Depending on your application, reusing the sequence may not be advisable.
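The pre-computed table idea from point 2 could look roughly like this. It is a sketch under assumptions: the class name PrecomputedTable is made up, the table holds doubles, and reads wrap around to the beginning when they reach the end (which is exactly the reuse the caveat above warns about).

```java
import java.util.Random;

class PrecomputedTable {
    private final double[] table;
    private int pos;

    // Fill the table once, then use a single generator call to pick the start.
    PrecomputedTable(int size, Random source) {
        table = new double[size];
        for (int i = 0; i < size; i++) {
            table[i] = source.nextDouble();
        }
        pos = source.nextInt(size); // the one call that selects the starting position
    }

    // Reading a value is just an array access plus an index update.
    double next() {
        double value = table[pos];
        pos = (pos + 1) % table.length; // wrap around: values repeat after table.length reads
        return value;
    }

    public static void main(String[] args) {
        PrecomputedTable t = new PrecomputedTable(1 << 20, new Random());
        double sum = 0;
        for (int i = 0; i < 1000; i++) {
            sum += t.next();
        }
        System.out.println(sum / 1000); // average of 1000 table values
    }
}
```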

While Random is likely good enough, you can improve on Math.random() by using a method which is closer to what you need, e.g.

Random rand = new Random();

for (int i = 0; i < n; i++) {        // n: however many rolls you need
    int dice = rand.nextInt(6) + 1;  // uniform value in 1..6
}

This is much faster than using Math.random(). And if you need a long:

long l = rand.nextLong();

In this case l is a 64-bit value, but Math.random() has 53 bits at best (actually, it has only 48 bits). Keep in mind that java.util.Random has only a 48-bit seed, so nextLong() cannot return every possible long value either.

If you need lots of numbers quickly (as for simulation or Monte Carlo integration), then a cryptographically secure RNG won't be fast enough. java.util.Random is fast, but a very poor quality PRNG. What you need is a high-quality fast PRNG, like Mersenne Twister or XorShift. Take a look at http://xorshift.di.unimi.it/ , for example, or my own ojrandlib.
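To give a feel for how small an XorShift generator is, here is a sketch of Marsaglia's basic 64-bit xorshift step with the shift triple (13, 7, 17). This is the plain variant, not one of the generators from the linked site, which add a scrambling step on top for better statistical quality. The class name XorShift64 and the fallback seed constant are assumptions made for illustration.

```java
class XorShift64 {
    private long state;

    XorShift64(long seed) {
        // The state must never be zero, otherwise every output would be zero.
        state = (seed == 0L) ? 0x9E3779B97F4A7C15L : seed;
    }

    // One xorshift step: three shift-and-xor operations on the 64-bit state.
    long nextLong() {
        long x = state;
        x ^= x << 13;
        x ^= x >>> 7;
        x ^= x << 17;
        state = x;
        return x;
    }

    // Uniform double in [0, 1) built from the top 53 bits of the next long.
    double nextDouble() {
        return (nextLong() >>> 11) * 0x1.0p-53;
    }

    public static void main(String[] args) {
        XorShift64 rng = new XorShift64(System.nanoTime());
        for (int i = 0; i < 3; i++) {
            System.out.println(rng.nextDouble());
        }
    }
}
```

Like java.util.Random, this is fully deterministic for a given seed, and it is not suitable for anything security-related.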

Try the AESPRNG internal generator from the java kiss library. It is thread-safe and about twice as fast as Random when used for bulk requests, and it can produce 128-bit cryptographically strong (but repeatable, if you reset the seed) pseudo-random numbers. It is based on AES CTR mode, which is highly optimized on most systems.

kiss.util.AESPRNG prng = new kiss.util.AESPRNG();
double [] x = new double [1_000_000];
prng.nextDoubles(x,0,x.length);

If you want a repeatable sequence, use seed(byte[] value16) or seed(double value) to reset the sequence. It is a drop-in replacement for Random, but with a number of convenience methods for ranges or bulk numbers. It is really just better than any of the suggested alternatives: fast, repeatable, and 128-bit strongly random.
