
Generating large random numbers between an inclusive range in Node.js

So I'm very familiar with the good old

Math.floor(Math.random() * (max - min + 1)) + min;

and this works very nicely with small numbers. However, when numbers get larger this quickly becomes biased and only returns numbers one order of magnitude below the maximum. For example, a random number between 0 and 1e100 will almost always return [x]e99 (it did so every time I tested, which was several billion times, since I used a for loop to generate lots of numbers). And yes, I waited the long time for the program to generate that many numbers, twice. By this point, it is safe to assume that the output is always [x]e99 for all practical purposes.

So next I tried this

Math.floor(Math.pow(max - min + 1, Math.random())) + min;

and while that works perfectly for huge ranges, it breaks for small ones. So my question is: how can I do both, that is, generate both small and large random numbers without any bias (or with bias so small that it is not noticeable)?

Note: I'm using Decimal.js to handle numbers in the range -1e2043 < x < 1e2043, but since it is the same algorithm I displayed the vanilla JavaScript forms above to prevent confusion. I can take a vanilla answer and convert it to Decimal.js without any trouble, so feel free to answer with either.

Note #2: I want to even out the odds of getting large numbers. For example, 1e33 should have the same odds as 1e90 in my 0-1e100 example. But at the same time I need to support smaller numbers and ranges.

Your problem is precision. That's the reason you use Decimal.js in the first place. Like every other Number in JS, Math.random() supports only 53 bits of precision (some browsers even used to generate only the upper 32 bits of randomness). But your value 1e100 would need 333 bits of precision. So the lower 280 bits (about 84 of the 100 decimal digits) are discarded in your formula.
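To make the precision loss concrete, here is a small sketch (the helper name trailingZeroBits is made up for illustration) that converts one result of Math.random() * 1e100 to a BigInt and counts how many low-order bits are zero. Since a Number carries only 53 significant bits, everything below them should come out as zeros.

```javascript
// Hypothetical helper: count how many low-order bits of a BigInt are zero.
function trailingZeroBits(n) {
  let count = 0;
  while (n > 0n && (n & 1n) === 0n) {
    n >>= 1n;
    count++;
  }
  return count;
}

// One sample from the naive formula, converted exactly to an integer.
const sample = BigInt(Math.floor(Math.random() * 1e100));
const totalBits = sample.toString(2).length;
const zeroBits = trailingZeroBits(sample);

console.log(sample.toString());
console.log("bits in the value:", totalBits);
console.log("trailing zero bits:", zeroBits);
console.log("bits carrying randomness (at most 53):", totalBits - zeroBits);
```

However large the printed value is, the last line should never report more than 53 bits.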

But Decimal.js provides a random() method. Why don't you use that one?

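function random(min, max){
    var delta = new Decimal(max).sub(min);
    //Decimal.random(sd) expects an integer number of significant digits >= 1
    var digits = Math.max(1, Math.ceil(+delta.log(10)));
    //Decimal.precision must be at least as large as digits, or mul() rounds the result
    return Decimal.random(digits).mul(delta).add(min);
}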
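Since the question also accepts a vanilla answer, here is a dependency-free sketch for comparison. It is not the Decimal.js approach above: it swaps in BigInt plus rejection sampling on bytes from Node's crypto module, and the function name randomBigIntBetween is an assumption made for this example. It returns integers only, with both ends of the range included.

```javascript
const { randomBytes } = require("node:crypto");

// Uniform BigInt in [min, max] (both inclusive) via rejection sampling.
// Hypothetical helper, not part of Decimal.js or any library.
function randomBigIntBetween(min, max) {
  if (max < min) throw new RangeError("max must be >= min");
  const span = max - min + 1n;              // number of possible outcomes
  const bits = span.toString(2).length;     // enough bits to cover the span
  const bytes = Math.ceil(bits / 8);
  const excess = BigInt(bytes * 8 - bits);  // surplus bits to shift away
  while (true) {
    const hex = randomBytes(bytes).toString("hex");
    const candidate = BigInt("0x" + hex) >> excess;
    // Candidates outside the span are thrown away instead of being wrapped
    // around, which is what keeps the accepted values unbiased.
    if (candidate < span) return min + candidate;
  }
}

console.log(randomBigIntBetween(0n, 10n ** 100n).toString());
console.log(randomBigIntBetween(-5n, 5n).toString());
```

Each draw is accepted with a probability of at least about one half, so the loop typically ends after one or two iterations.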

Another "problem", and the reason you get so many values with e+99, is probability. For the range 0 .. 1e100 the probabilities of getting a given exponent are

e+99  => 90%, 
e+98  =>  9%,
e+97  =>  0.9%,
e+96  =>  0.09%,
e+95  =>  0.009%,
e+94  =>  0.0009%,
e+93  =>  0.00009%,
e+92  =>  0.000009%,
e+91  =>  0.0000009%,
e+90  =>  0.00000009%,
and so on

So if you generate ten billion numbers, statistically you'll get a single value up to 1e+90. Those are the odds.
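The table follows from the width of each decade: the interval [10^k, 10^(k+1)) covers 9 * 10^(k-100) of the range 0 .. 1e100. The sketch below (function names and sample size are arbitrary choices for this example) computes that share and compares it with a quick simulation of the naive formula.

```javascript
// Share of the uniform range [0, 10^maxExp) taken up by numbers whose
// decimal exponent is k, i.e. the interval [10^k, 10^(k+1)).
function exponentShare(k, maxExp) {
  return 9 * Math.pow(10, k - maxExp);
}

// Empirical tally of exponents produced by the naive formula.
function tallyExponents(samples, maxExp) {
  const tally = new Map();
  for (let i = 0; i < samples; i++) {
    const x = Math.random() * Math.pow(10, maxExp);
    const k = Math.floor(Math.log10(x));
    tally.set(k, (tally.get(k) || 0) + 1);
  }
  return tally;
}

const sampleCount = 100000;
const tally = tallyExponents(sampleCount, 100);
for (const k of [99, 98, 97]) {
  const observed = (tally.get(k) || 0) / sampleCount;
  console.log(`e+${k}: expected ${exponentShare(k, 100)}, observed ${observed}`);
}
```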

I want to even out those odds for large numbers. 1e33 should have the same odds as 1e90, for example.

OK, then let's generate 10^random, with the exponent drawn uniformly so that the result lies in the range min ... max.

function random2(min, max){
    var a = +Decimal.log10(min),
        b = +Decimal.log10(max);
    //trying to deal with zero-values.
    if(a === -Infinity && b === -Infinity) return 0;  //a random value between 0 and 0 ;)
    if(a === -Infinity) a = Math.min(0, b-53);
    if(b === -Infinity) b = Math.min(0, a-53);

    //Decimal.random(sd) expects an integer number of significant digits >= 1
    var digits = Math.max(1, Math.ceil(Math.abs(b-a)));
    return Decimal.pow(10, Decimal.random(digits).mul(b-a).add(a) );
}

Now the exponents are pretty much uniformly distributed, but the values are a bit skewed, because 10^1 to 10^1.5 (roughly 10 .. 31) has the same probability as 10^1.5 to 10^2 (roughly 32 .. 100).
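The skew can be observed with plain Numbers. This is a minimal sketch assuming 0 < min < max and values that fit in a Number. The name logUniform and the sample size are made up for illustration. It draws the exponent uniformly and counts how many samples from [10, 100) fall below 10^1.5.

```javascript
// Log-uniform sample in [min, max): the exponent is uniform, the value is not.
// Assumes 0 < min < max and that both fit in a Number.
function logUniform(min, max) {
  const a = Math.log10(min);
  const b = Math.log10(max);
  return Math.pow(10, a + (b - a) * Math.random());
}

const samples = 100000;              // arbitrary sample size
const midpoint = Math.pow(10, 1.5);  // about 31.6
let below = 0;
for (let i = 0; i < samples; i++) {
  if (logUniform(10, 100) < midpoint) below++;
}
console.log("share below 10^1.5:", below / samples); // close to 0.5
```

About half of the samples land in the short interval below 31.6 and the other half in the much wider interval above it, which is exactly the trade-off described above.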

The issue with Math.random() * Math.pow(10, Math.floor(Math.random() * 100)); at smaller numbers is that random ranges over [0, 1), meaning that when calculating the exponent separately one needs to make sure the prefix ranges over [1, 10). Otherwise you want to calculate a number in [1eX, 1eX+1) but have e.g. 0.1 as the prefix and end up in 1eX-1. Here is an example; maxExp is 10 rather than 100 to keep the output readable, but it is easily adjustable.

let maxExp = 10;

function differentDistributionRandom() {
    let exp = Math.floor(Math.random() * (maxExp + 1)) - 1;
    if (exp < 0) return Math.random();
    else return (Math.random() * 9 + 1) * Math.pow(10, exp);
}

let counts = new Array(maxExp + 1).fill(0).map(e => []);
for (let i = 0; i < (maxExp + 1) * 1000; i++) {
    let x = differentDistributionRandom();
    counts[Math.max(0, Math.floor(Math.log10(x)) + 1)].push(x);
}

counts.forEach((e, i) => {
    console.log(`E: ${i - 1 < 0 ? "<0" : i - 1}, amount: ${e.length}, example: ${e.length === 0 ? "none" : e[0]}`);
});

You might see the category <0 here, which is hopefully what you wanted. The cutoff point is arbitrary: here [0, 1) has the same probability as [1, 10), as [10, 100) and so on, but [0.01, 0.1) is again less likely than [0.1, 1).
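The prefix issue can also be checked in isolation. The sketch below (exponent and trial count are arbitrary) compares a prefix drawn from [0, 1) with one drawn from [1, 10) and counts how often the result lands in the intended decade [1e5, 1e6).

```javascript
const exp = 5;                       // arbitrary exponent for the demonstration
const low = Math.pow(10, exp);       // 1e5
const high = Math.pow(10, exp + 1);  // 1e6
const trials = 10000;

let naiveHits = 0;   // prefix in [0, 1)
let scaledHits = 0;  // prefix in [1, 10)
for (let i = 0; i < trials; i++) {
  const naive = Math.random() * low;
  const scaled = (Math.random() * 9 + 1) * low;
  if (naive >= low && naive < high) naiveHits++;
  if (scaled >= low && scaled < high) scaledHits++;
}
console.log("prefix in [0, 1) inside the decade:", naiveHits, "of", trials);
console.log("prefix in [1, 10) inside the decade:", scaledHits, "of", trials);
```

The first count stays at zero because every such value falls below 1e5, while the second covers all trials.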

If you didn't insist on base 10, you could reinterpret the pseudorandom bits from two Math.random calls as a Float64, which would give a similar distribution in base 2:

function exponentDistribution() {
    let bits = [Math.random(), Math.random()];
    let buffer = new ArrayBuffer(24);
    let view = new DataView(buffer);
    view.setFloat64(8, bits[0]);
    view.setFloat64(16, bits[1]);
    //copy the low 4 bytes of each float into bytes 0..7
    //alternatively all at once with setInt32
    for (let i = 0; i < 4; i++) {
        view.setInt8(i, view.getInt8(12 + i));
        view.setInt8(i + 4, view.getInt8(20 + i));
    }
    return Math.abs(view.getFloat64(0));
}

let counts = new Array(11).fill(0).map(e => []);
for (let i = 0; i < (1 << 11) * 100; i++) {
    let x = exponentDistribution();
    let exp = Math.floor(Math.log2(x));
    if (exp >= -5 && exp <= 5) {
        counts[exp + 5].push(x);
    }
}

counts.forEach((e, i) => {
    console.log(`E: ${i - 5}, amount: ${e.length}, example: ${e.length === 0 ? "none" : e[0]}`);
});

This one is obviously bounded by the range of Float64. There are some uneven parts of the distribution due to details of IEEE 754, e.g. denormals/subnormals, and I did not take care of special values like Infinity. It is rather to be seen as a fun extra, a reminder of how float values are distributed. Note that the loop runs 1 << 11 (2048) times the desired number of hits per bucket, which matches the exponent range of Float64: 11 bits, [-1022, 1023]. That's why in the example each bucket gets approximately that number (100) of hits.
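For readers who want to see the 11 exponent bits directly, here is a small sketch (the function name float64Exponent is made up for this example) that reads the exponent field out of a Float64 with a DataView and removes the IEEE 754 bias of 1023.

```javascript
// Read the unbiased binary exponent out of a Float64's bit pattern.
function float64Exponent(x) {
  const view = new DataView(new ArrayBuffer(8));
  view.setFloat64(0, x);                // big-endian by default
  const top16 = view.getUint16(0);      // sign bit, 11 exponent bits, 4 mantissa bits
  const biased = (top16 >> 4) & 0x7ff;  // keep the 11 exponent bits
  return biased - 1023;                 // remove the IEEE 754 bias
}

console.log(float64Exponent(1));                 // 0
console.log(float64Exponent(8));                 // 3
console.log(float64Exponent(0.5));               // -1
console.log(float64Exponent(Number.MAX_VALUE));  // 1023
console.log(float64Exponent(Number.MIN_VALUE));  // -1023 (exponent field 0 marks subnormals)
```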

You can create the number in increments less than Number.MAX_SAFE_INTEGER, then concatenate the generated numbers into a single string:

const r = () => Math.floor(Math.random() * Number.MAX_SAFE_INTEGER);
let N = "";
for (let i = 0; i < 10; i++) N += r();
console.log(N);
console.log(/e/.test(N)); // false: the digits contain no exponent notation
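One thing to watch with plain concatenation is that r() returns chunks of varying length (there is no zero padding), so the digit positions are not filled uniformly. A possible variation, sketched here with a made-up helper name, draws one decimal digit at a time and converts the result to a BigInt. It assumes Math.random() is good enough for the purpose at hand.

```javascript
// Hypothetical helper: a uniform integer in [0, 10^digits) as a BigInt,
// built one decimal digit at a time so every position is filled.
function randomDigitsBigInt(digits) {
  let s = "";
  for (let i = 0; i < digits; i++) {
    s += Math.floor(Math.random() * 10); // one digit, 0 through 9
  }
  return BigInt(s); // leading zeros are accepted; an empty string gives 0n
}

const n = randomDigitsBigInt(100);
console.log(n.toString());
```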
