
Normally distributed random integers?

Is there a nice way to generate random integers that follow a normal distribution?

The first method that comes to mind:

int rndi = (int)Math.floor(random.nextGaussian()*std);

Is there a better way?

Strictly speaking, you can't have normally distributed integers. Maybe what you want is the output of a normal distribution sorted into buckets. In that case, you probably want to shift and scale your normal distribution according to the size of your array. If you just take samples from a standard normal distribution (mean = 0 and standard deviation = 1), you'll get samples between -2 and 2 about 95% of the time.

Suppose you want random samples from an array of size N. You want the entries in the middle to be chosen more often than the entries at the ends, but you want the entries near the ends to come up occasionally, say 1% of the time. Then you may want to compute something like N/2 + N*z/4, where z is your standard normal sample, and then cast the result to an integer. If you do this, you'll occasionally get an index outside your array. Just test for that and draw a new value when it happens.
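The bucketing idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class and method names (NormalIndexSampler, nextIndex) are made up for the example, and Math.floor is used instead of a plain cast so that slightly negative values are rejected rather than truncated to index 0.

```java
import java.util.Random;

// Hypothetical helper, not part of the original answer.
class NormalIndexSampler {

    // Draws an index in [0, n) centred on n/2 using the N/2 + N*z/4 mapping.
    // Draws that land outside the array are discarded and retried.
    static int nextIndex(Random random, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        while (true) {
            double z = random.nextGaussian();   // standard normal sample
            int index = (int) Math.floor(n / 2.0 + n * z / 4.0);
            if (index >= 0 && index < n) {
                return index;
            }
        }
    }

    public static void main(String[] args) {
        Random random = new Random();
        int n = 10;
        int[] counts = new int[n];
        for (int i = 0; i < 100_000; i++) {
            counts[nextIndex(random, n)]++;
        }
        // Middle indices should show clearly higher counts than the ends.
        for (int i = 0; i < n; i++) {
            System.out.println(i + ": " + counts[i]);
        }
    }
}
```

Changing the divisor 4 changes how often the ends are hit: a larger divisor concentrates the draws around the middle, a smaller one causes more rejections.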

You should update the question to make clear exactly what your use case is.

According to your comment, you shouldn't be using a normal distribution at all. Instead, try one of the many discrete distributions, since you want integers in the end. There are a lot of those, but I'd recommend one that is very simple: it uses a stochastic vector as the discrete probability distribution.

Here's an example implementation:

public class DiscreteRandom {

    private final double[] probDist;

    public DiscreteRandom(double... probs) {
        this.probDist = makeDistribution(probs);
    }

    // Turns the probabilities into cumulative sums,
    // e.g. {0.75, 0.25} becomes {0.75, 1.0}.
    private double[] makeDistribution(double[] probs) {
        double[] distribution = new double[probs.length];
        double sum = 0;
        for (int i = 0; i < probs.length; i++) {
            sum += probs[i];
            distribution[i] = sum;
        }
        return distribution;
    }

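    public int nextInt() {
        double rand = Math.random();
        int i = 0;
        // Stop at the last bucket so that rounding error in the cumulative
        // sums (a total slightly below 1.0) cannot run past the array.
        while (i < probDist.length - 1 && rand > probDist[i]) i++;
        return i;
    }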

    /**
     * Simple test
     */
    public static void main(String[] args) {
        // We want 0 to come 3 times more often than 1.
        // The implementation requires normalized probability
        // distribution thus testProbs elements sum up to 1.0d.
        double[] testProbs = {0.75d, 0.25d};
        DiscreteRandom randGen = new DiscreteRandom(testProbs);

        // Loop 1000 times, we expect:
        // sum0 ~ 750
        // sum1 ~ 250
        int sum0 = 0, sum1 = 0, rand;
        for (int i = 0; i < 1000; i++) {
            rand = randGen.nextInt();
            if (rand == 0) sum0++;
            else           sum1++;
        }
        System.out.println("sum0 = " + sum0 + ", sum1 = " + sum1);
    }
}
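To connect the stochastic-vector approach back to the original question, one possible sketch fills the probability vector with weights shaped like a bell curve over a fixed integer range. Everything here is an assumption made for illustration: the class name DiscretizedNormal, its constructor parameters, and the choice to weight each integer k by exp(-(k - mean)^2 / (2 * std^2)) before normalising.

```java
import java.util.Random;

// Hypothetical example class, not part of the original answer.
class DiscretizedNormal {

    private final int min;
    private final double[] cumulative;   // running sums of the probabilities
    private final Random random;

    DiscretizedNormal(int min, int max, double mean, double std, Random random) {
        if (max < min || std <= 0) {
            throw new IllegalArgumentException("need max >= min and std > 0");
        }
        this.min = min;
        this.random = random;

        // Unnormalised bell-shaped weight for every integer in [min, max].
        double[] weights = new double[max - min + 1];
        double total = 0;
        for (int i = 0; i < weights.length; i++) {
            double d = (min + i) - mean;
            weights[i] = Math.exp(-d * d / (2 * std * std));
            total += weights[i];
        }
        if (total == 0) {
            throw new IllegalArgumentException("all weights underflowed to zero");
        }

        // Normalise and accumulate, as in makeDistribution above.
        cumulative = new double[weights.length];
        double sum = 0;
        for (int i = 0; i < weights.length; i++) {
            sum += weights[i] / total;
            cumulative[i] = sum;
        }
    }

    int nextInt() {
        double u = random.nextDouble();
        int i = 0;
        while (i < cumulative.length - 1 && u > cumulative[i]) i++;
        return min + i;
    }

    public static void main(String[] args) {
        DiscretizedNormal gen = new DiscretizedNormal(0, 10, 5.0, 2.0, new Random());
        for (int i = 0; i < 10; i++) {
            System.out.print(gen.nextInt() + " ");
        }
        System.out.println();
    }
}
```

Unlike rounding a Gaussian sample, this never produces a value outside [min, max], so no retry loop is needed. The linear scan is fine for small ranges; a binary search over the cumulative array would be the usual choice for large ones.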

That depends on what you are trying to do with those random numbers.

java.util.Random has some flaws. As stated in the JavaDoc, the nextGaussian() method uses the polar form of the Box-Muller transform. It depends on Random.nextDouble(), which is implemented using a linear congruential generator. And the implementation is not the best one, as stated in a bugfix proposal:

Sun's method uses a 48 bit seed and (as far as the bottom bit is concerned) only accesses 17 bits of this - producing extremely severe non-randomness.

So if you are interested in high statistical quality, you should really avoid Sun's implementation. Take a look at this "Not so random" applet for visual proof of how bad it is.

If statistical quality is a concern for you, the best you can do is use some external PRNG library.
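The answer does not name a specific library, so none is assumed here. As a swapped-in alternative that needs no extra dependency, java.security.SecureRandom is a subclass of java.util.Random, so it inherits nextGaussian() while drawing its bits from a cryptographically strong source instead of the linear congruential generator. The helper below (GaussianIntSource, nextGaussianInt) is a hypothetical sketch, and it is typically slower than java.util.Random.

```java
import java.security.SecureRandom;
import java.util.Random;

// Hypothetical helper; the generator is passed in so it can be swapped.
class GaussianIntSource {

    // Scales and shifts a standard normal sample, then rounds to the nearest int.
    static int nextGaussianInt(Random random, double mean, double std) {
        return (int) Math.round(mean + random.nextGaussian() * std);
    }

    public static void main(String[] args) {
        Random random = new SecureRandom();   // swapped in for new Random()
        for (int i = 0; i < 10; i++) {
            System.out.print(nextGaussianInt(random, 100.0, 10.0) + " ");
        }
        System.out.println();
    }
}
```

Because the method only depends on the Random type, a generator from an external library that extends Random could be passed in the same way.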

You can precompute a list of "random" integers, then hand-tweak that list to get the distribution you want.

Then, when you want a "random" number, just pull the next available one from the list.

This way you control the distribution, and therefore the probability of a particular item being selected. For fun, you can "mix up" your list whenever you need to.
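A minimal sketch of that precomputed-list idea, assuming the distribution is given as a count per value (the class name PrecomputedPool and the counts array are illustrative, not from the answer):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical example class.
class PrecomputedPool {

    private final List<Integer> pool = new ArrayList<>();
    private final Random random;
    private int next = 0;

    // counts[v] says how many copies of the value v go into the pool.
    PrecomputedPool(int[] counts, Random random) {
        this.random = random;
        for (int value = 0; value < counts.length; value++) {
            for (int c = 0; c < counts[value]; c++) {
                pool.add(value);
            }
        }
        if (pool.isEmpty()) {
            throw new IllegalArgumentException("pool is empty");
        }
        Collections.shuffle(pool, random);
    }

    // Returns the next value from the pool; reshuffles once it is exhausted.
    int nextInt() {
        if (next == pool.size()) {
            Collections.shuffle(pool, random);
            next = 0;
        }
        return pool.get(next++);
    }

    public static void main(String[] args) {
        // Hand-tuned bell-like shape over the values 0..4.
        PrecomputedPool gen = new PrecomputedPool(new int[] {1, 2, 4, 2, 1}, new Random());
        for (int i = 0; i < 10; i++) {
            System.out.print(gen.nextInt() + " ");
        }
        System.out.println();
    }
}
```

Note the trade-off: over each full pass through the pool the frequencies match the counts exactly, which also means consecutive draws are not independent of one another.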
