Get n distinct random numbers between two values whose sum is equal to a given number

I would like to find distinct random numbers within a range that sum up to a given number.

Note: I found similar questions on Stack Overflow, but they do not address exactly this problem (i.e. they do not consider a negative lowerLimit for the range).

If I wanted the sum of my random numbers to equal 1, I would just generate the required random numbers, compute their sum, and divide each of them by the sum. Here I need something a bit different: my random numbers must add up to something other than 1, and they must still lie within a given range.

Example: I need 30 distinct random numbers (non-integers) between -50 and 50, where the sum of the 30 generated numbers must be equal to 300. I wrote the code below, but it does not work when n is much larger than the range (upperLimit - lowerLimit): the function can return numbers outside the range [lowerLimit, upperLimit]. Any help to improve the current solution?

static void Main(string[] args)
{
    var listWeights = GetRandomNumbersWithConstraints(30, 50, -50, 300);
}

private static List<double> GetRandomNumbersWithConstraints(int n, int upperLimit, int lowerLimit, int sum)
{
    if (upperLimit <= lowerLimit || n < 1)
        throw new ArgumentOutOfRangeException();

    Random rand = new Random(Guid.NewGuid().GetHashCode());
    List<double> weight = new List<double>();

    for (int k = 0; k < n; k++)
    {
        //multiply by rand.NextDouble() to avoid duplicates
        double temp = (double)rand.Next(lowerLimit, upperLimit) * rand.NextDouble();

        if (weight.Contains(temp))
            k--;
        else
            weight.Add(temp);
    }

    //divide each element by the sum
    weight = weight.ConvertAll<double>(x => x / weight.Sum());  //here the sum of my weight will be 1 

    return weight.ConvertAll<double>(x => x * sum);
}

EDIT - to clarify

Running the current code generates the following 30 numbers, which add up to 300. However, those numbers are not all within -50 and 50:

-4.425315699
67.70219958
82.08592061
46.54014109
71.20352208
-9.554070146
37.65032717
-75.77280868
24.68786878
30.89874589
142.0796933
-1.964407284
9.831226893
-15.21652248
6.479463312
49.61283063
118.1853036
-28.35462683
49.82661159
-65.82706541
-29.6865969
-54.5134262
-56.04708803
-84.63783048
-3.18402453
-13.97935982
-44.54265204
112.774348
-2.911427266
-58.94098071

Ok, here is how it could be done.

We will use the Dirichlet distribution, which is a distribution for random numbers $x_i$ in the range [0...1] such that

$$\sum_i x_i = 1$$

So, after a linear rescaling, the condition on the sum is satisfied automatically. The Dirichlet distribution is parametrized by $\alpha_i$, but we assume all random numbers come from the same marginal distribution, so there is a single parameter α shared by every index.

The mean of each sampled number is 1/n, and its variance is (n - 1)/(n²(nα + 1)), which is roughly 1/(n²α) for reasonably large α, so a larger α pulls the random values closer to the mean.

Ok, now back to the rescaling,

$$v_i = A + B\,x_i$$

And we have to find A and B. As @HansKesting rightfully noted, with only two free parameters we can satisfy only two constraints, but you have three. So we strictly satisfy the lower bound constraint and the sum constraint, but occasionally violate the upper bound constraint. In that case we just throw the whole sample away and draw another one.
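To make the two satisfied constraints explicit, here is a short derivation of A and B. It is a sketch that assumes the symbols lo, hi, n and S for the lower limit, upper limit, count and target sum, and it assumes S > n·lo so that B is positive; the result agrees with the expression `lo + (mean - lo)*n * rn[k]` used in the code further down.

```latex
\begin{aligned}
x_i = 0 \;\Rightarrow\; v_i = A
  &\quad\Longrightarrow\quad A = \mathrm{lo},\\
\sum_{i=1}^{n} v_i = nA + B\sum_{i=1}^{n} x_i = nA + B = S
  &\quad\Longrightarrow\quad B = S - n\,\mathrm{lo} = n\,(\mu - \mathrm{lo}),
  \qquad \mu = S/n,\\
v_i \le \mathrm{hi}
  &\quad\Longleftrightarrow\quad x_i \le \frac{\mathrm{hi} - \mathrm{lo}}{S - n\,\mathrm{lo}}.
\end{aligned}
```

For the numbers in the question (lo = -50, hi = 50, n = 30, S = 300) this gives A = -50 and B = 1800, and a sample is accepted only if every x_i is at most 100/1800 = 1/18 ≈ 0.056, while the mean of each x_i is 1/30 ≈ 0.033.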

Again, we have a knob to turn: a larger α keeps the values closer to the mean and makes it less likely to hit the upper bound. With α = 1 I rarely get a good sample, with α = 10 close to 40% of the samples are good, and with α = 16 close to 80% are good.
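The acceptance rates quoted above can be estimated with a small simulation. The sketch below is not the answer's code: it assumes a hand-rolled Marsaglia-Tsang Gamma sampler (valid here only for α ≥ 1) in place of a library call so that it compiles on its own, and the names `AcceptanceRateSketch`, `NextGaussian`, `NextGamma` and `EstimateAcceptanceRate` are hypothetical. It relies on the standard fact that normalizing n independent Gamma(α, 1) draws by their sum yields a symmetric Dirichlet sample.

```csharp
using System;

static class AcceptanceRateSketch
{
    // Standard normal draw via Box-Muller (stand-in for a library sampler).
    static double NextGaussian(Random rng)
    {
        double u1 = 1.0 - rng.NextDouble(); // in (0, 1], avoids log(0)
        double u2 = rng.NextDouble();
        return Math.Sqrt(-2.0 * Math.Log(u1)) * Math.Cos(2.0 * Math.PI * u2);
    }

    // Gamma(alpha, 1) draw via Marsaglia-Tsang; this sketch assumes alpha >= 1.
    static double NextGamma(Random rng, double alpha)
    {
        if (alpha < 1.0)
            throw new ArgumentOutOfRangeException(nameof(alpha), "sketch assumes alpha >= 1");
        double d = alpha - 1.0 / 3.0;
        double c = 1.0 / Math.Sqrt(9.0 * d);
        while (true)
        {
            double x = NextGaussian(rng);
            double t = 1.0 + c * x;
            if (t <= 0.0) continue;            // proposal outside the support, retry
            double v = t * t * t;
            double u = 1.0 - rng.NextDouble(); // in (0, 1]
            if (Math.Log(u) < 0.5 * x * x + d - d * v + d * Math.Log(v))
                return d * v;
        }
    }

    // Fraction of `trials` rescaled Dirichlet samples whose values all stay <= hi.
    public static double EstimateAcceptanceRate(double alpha, int n, double sum,
                                                double lo, double hi, int trials, Random rng)
    {
        double scale = sum - n * lo;           // B in v_i = A + B * x_i, with A = lo
        var g = new double[n];
        int accepted = 0;
        for (int t = 0; t < trials; t++)
        {
            double total = 0.0;
            for (int k = 0; k < n; k++) { g[k] = NextGamma(rng, alpha); total += g[k]; }

            bool ok = true;
            for (int k = 0; k < n && ok; k++)
                ok = lo + scale * g[k] / total <= hi;
            if (ok) accepted++;
        }
        return (double)accepted / trials;
    }

    static void Main()
    {
        var rng = new Random(12345);
        foreach (double alpha in new[] { 1.0, 10.0, 16.0 })
        {
            double rate = EstimateAcceptanceRate(alpha, 30, 300.0, -50.0, 50.0, 20000, rng);
            Console.WriteLine($"alpha = {alpha}: acceptance rate ~ {rate:F3}");
        }
    }
}
```

The printed rates are Monte Carlo estimates and will vary with the seed and the number of trials, so they should only be compared loosely with the percentages mentioned above.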

Dirichlet sampling is done via the Gamma distribution, using code from Math.NET Numerics (MathDotNet).

Code, tested with .NET Core 2.1

using System;

using MathNet.Numerics.Distributions;
using MathNet.Numerics.Random;

class Program
{
    static void SampleDirichlet(double alpha, double[] rn)
    {
        if (rn == null)
            throw new ArgumentException("SampleDirichlet:: Results placeholder is null");

        if (alpha <= 0.0)
            throw new ArgumentException($"SampleDirichlet:: alpha {alpha} is non-positive");

        int n = rn.Length;
        if (n == 0)
            throw new ArgumentException("SampleDirichlet:: Results placeholder is of zero size");

        var gamma = new Gamma(alpha, 1.0);

        double sum = 0.0;
        for(int k = 0; k != n; ++k) {
            double v = gamma.Sample();
            sum  += v;
            rn[k] = v;
        }

        if (sum <= 0.0)
            throw new ApplicationException($"SampleDirichlet:: sum {sum} is non-positive");

        // normalize
        sum = 1.0 / sum;
        for(int k = 0; k != n; ++k) {
            rn[k] *= sum;
        }
    }

    static bool SampleBoundedDirichlet(double alpha, double sum, double lo, double hi, double[] rn)
    {
        if (rn == null)
            throw new ArgumentException("SampleDirichlet:: Results placeholder is null");

        if (alpha <= 0.0)
            throw new ArgumentException($"SampleDirichlet:: alpha {alpha} is non-positive");

        if (lo >= hi)
            throw new ArgumentException($"SampleDirichlet:: low {lo} is larger than high {hi}");

        int n = rn.Length;
        if (n == 0)
            throw new ArgumentException("SampleDirichlet:: Results placeholder is of zero size");

        double mean = sum / (double)n;
        if (mean < lo || mean > hi)
            throw new ArgumentException($"SampleDirichlet:: mean value {mean} is not within [{lo}...{hi}] range");

        SampleDirichlet(alpha, rn);

        bool rc = true;
        for(int k = 0; k != n; ++k) {
            double v = lo + (mean - lo)*(double)n * rn[k];
            if (v > hi)
                rc = false;
            rn[k] = v;
        }
        return rc;
    }

    static void Main(string[] args)
    {
        double[] rn = new double [30];

        double lo = -50.0;
        double hi =  50.0;

        double alpha = 10.0;

        double sum = 300.0;

        for(int k = 0; k != 1_000; ++k) {
            var q = SampleBoundedDirichlet(alpha, sum, lo, hi, rn);
            Console.WriteLine($"Rng(BD), v = {q}");
            double s = 0.0;
            foreach(var r in rn) {
                Console.WriteLine($"Rng(BD),     r = {r}");
                s += r;
            }
            Console.WriteLine($"Rng(BD),    summa = {s}");
        }
    }
}

UPDATE

Usually, when people ask such a question, there is an implicit assumption/requirement - all random numbers shall be distributed in the same way. It means that if I draw the marginal probability density function (PDF) for the item at index 0 of the sampled array, I shall get the same distribution as when I draw the marginal PDF for the last item in the array. People usually sample random arrays in order to pass them to other routines that do some interesting stuff. If the marginal PDF for item 0 differs from the marginal PDF for the last item, then merely reversing the array will produce wildly different results in the code that uses such random values.

Here I plotted the distributions of random numbers for item 0 and the last item (#29) under the original conditions ([-50...50], sum=300), using my sampling routine. They look similar, don't they?

[figure: distributions of item 0 and item #29, Dirichlet-based sampling routine]

Ok, here is a picture from your sampling routine, with the same original conditions ([-50...50], sum=300) and the same number of samples.

[figure: distributions of item 0 and item #29, the question's sampling routine]

UPDATE II

The user is supposed to check the return value of the sampling routine and to accept and use the sampled array if (and only if) the return value is true. This is the acceptance/rejection method. As an illustration, below is the code used to histogram the samples:

        int[] hh = new int[100]; // histogram allocated

        var s = 1.0; // step size
        int k = 0;   // good samples counter
        for( ;; ) {
            var q = SampleBoundedDirichlet(alpha, sum, lo, hi, rn);
            if (q) // good sample, accept it
            {
                var v = rn[0]; // any index, 0 or 29 or ....
                var i = (int)((v - lo) / s);
                i = System.Math.Max(i, 0);
                i = System.Math.Min(i, hh.Length-1);
                hh[i] += 1;

                ++k;
                if (k == 100000) // required number of good samples reached
                    break;
            }
        }
        for(k = 0; k != hh.Length; ++k)
        {
            var x = lo + (double)k * s + 0.5*s;
            var v = hh[k];
            Console.WriteLine($"{x}     {v}");
        }
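For callers who only need one accepted array, the accept/reject loop can be wrapped with a retry cap. This is a minimal sketch: `RejectionSketch`, `SampleWithRetries` and `maxAttempts` are hypothetical names, and the sampler is passed in as a delegate so that the snippet compiles without the rest of the program.

```csharp
using System;

static class RejectionSketch
{
    // Calls `sampler` on `rn` until it reports an accepted sample or
    // `maxAttempts` tries have been used; `attempts` reports how many were made.
    public static bool SampleWithRetries(Func<double[], bool> sampler, double[] rn,
                                         int maxAttempts, out int attempts)
    {
        if (sampler == null) throw new ArgumentNullException(nameof(sampler));
        if (rn == null) throw new ArgumentNullException(nameof(rn));
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        for (attempts = 1; attempts <= maxAttempts; attempts++)
        {
            if (sampler(rn))
                return true;       // accepted: rn holds a usable sample
        }
        attempts = maxAttempts;    // every try was rejected; rn must not be used
        return false;
    }

    static void Main()
    {
        // Toy sampler for demonstration only: rejects twice, then accepts.
        int calls = 0;
        var rn = new double[3];
        bool ok = SampleWithRetries(buf => ++calls >= 3, rn, 10, out int used);
        Console.WriteLine($"accepted = {ok}, attempts = {used}");
    }
}
```

With the routine above, the delegate would be something like `buf => SampleBoundedDirichlet(alpha, sum, lo, hi, buf)`. A cap is useful because a small α can make accepted samples very rare.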

Here you go. It'll probably run for centuries before actually returning the list, but it'll comply :)

    public List<double> TheThing(int qty, double lowest, double highest, double sumto)
    {
        if (highest * qty < sumto)
        {
            throw new Exception("Impossibru!");
            // heresy
            highest = sumto / 1 + (qty * 2);
            lowest = -highest;
        }
        double rangesize = (highest - lowest);
        Random r = new Random();
        List<double> ret = new List<double>();

        while (ret.Sum() != sumto)
        {
            if (ret.Count > 0)
                ret.RemoveAt(0);
            while (ret.Count < qty)
                ret.Add((r.NextDouble() * rangesize) + lowest);
        }

        return ret;
    }

I came up with this solution, which is fast. I am sure it could be improved, but for the moment it does the job.

n = the number of random numbers that I need to find

Constraints

  • the n random numbers must add up to finalSum

  • the n random numbers must be within lowerLimit and upperLimit

The idea is to take an initial list of random numbers (one that sums up to finalSum) and remove from it the numbers outside the range [lowerLimit, upperLimit].

Then count the numbers left in the list (called nValid) and their sum (called sumOfValid). Now, iteratively search for (n - nValid) random numbers within the range [lowerLimit, upperLimit] whose sum is (finalSum - sumOfValid).

I tested it with several combinations of the input variables (including a negative sum) and the results look good.

static void Main(string[] args)
{
    int n = 100;
    int max = 5000;
    int min = -500000;
    double finalSum = -1000;

    for (int i = 0; i < 5000; i++)
    {
        var listWeights = GetRandomNumbersWithConstraints(n, max, min, finalSum);

        Console.WriteLine("=============");
        Console.WriteLine("sum   = " + listWeights.Sum());
        Console.WriteLine("max   = " + listWeights.Max());
        Console.WriteLine("min   = " + listWeights.Min());
        Console.WriteLine("count = " + listWeights.Count());
    }
}

private static List<double> GetRandomNumbersWithConstraints(int n, int upperLimit, int lowerLimit, double finalSum, int precision = 6)
{
    if (upperLimit <= lowerLimit || n < 1) //todo improve here
        throw new ArgumentOutOfRangeException();

    Random rand = new Random(Guid.NewGuid().GetHashCode());

    List<double> randomNumbers = new List<double>();

    int adj = (int)Math.Pow(10, precision);

    bool flag = true;
    List<double> weights = new List<double>();
    while (flag)
    {
        foreach (var d in randomNumbers.Where(x => x <= upperLimit && x >= lowerLimit).ToList())
        {
            if (!weights.Contains(d))  //only distinct
                weights.Add(d);
        }

        if (weights.Count() == n && weights.Max() <= upperLimit && weights.Min() >= lowerLimit && Math.Round(weights.Sum(), precision) == finalSum)
            return weights;

        /* worst case - if the largest sum of the missing elements (ie we still need to find 3 elements, 
         * then the largest sum is 3*upperlimit) is smaller than (finalSum - sumOfValid)
         */
        if (((n - weights.Count()) * upperLimit < (finalSum - weights.Sum())) ||
            ((n - weights.Count()) * lowerLimit > (finalSum - weights.Sum())))
        {
            weights = weights.Where(x => x != weights.Max()).ToList();
            weights = weights.Where(x => x != weights.Min()).ToList();
        }

        int nValid = weights.Count();
        double sumOfValid = weights.Sum();

        int numberToSearch = n - nValid;
        double sum = finalSum - sumOfValid;

        double j = finalSum - weights.Sum();
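        // the last missing value is fully determined; accept it only if it lies inside the range
        // (the original used ||, which is always true and let out-of-range values through)
        if (numberToSearch == 1 && j <= upperLimit && j >= lowerLimit)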
        {
            weights.Add(finalSum - weights.Sum());
        }
        else
        {
            randomNumbers.Clear();
            int min = lowerLimit;
            int max = upperLimit;
            for (int k = 0; k < numberToSearch; k++)
            {
                randomNumbers.Add((double)rand.Next(min * adj, max * adj) / adj);
            }

            if (sum != 0 && randomNumbers.Sum() != 0)
                randomNumbers = randomNumbers.ConvertAll<double>(x => x * sum / randomNumbers.Sum());
        }
    }

    return randomNumbers;
}
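The argument check marked "todo improve here" could also reject target sums that no set of n distinct values in the range can reach, which would otherwise keep the loop running forever. The following is a sketch under that assumption; `FeasibilitySketch` and `IsFeasible` are hypothetical names, and the limits are taken as double rather than int.

```csharp
using System;

static class FeasibilitySketch
{
    // True when n distinct real values in [lowerLimit, upperLimit] can add up to finalSum.
    public static bool IsFeasible(int n, double lowerLimit, double upperLimit, double finalSum)
    {
        if (n < 1 || upperLimit <= lowerLimit)
            return false;

        // A single value only has to lie inside the range.
        if (n == 1)
            return finalSum >= lowerLimit && finalSum <= upperLimit;

        // With n > 1, hitting n * upperLimit (or n * lowerLimit) exactly would force
        // every value onto the same bound, so the comparisons must be strict.
        return finalSum > n * lowerLimit && finalSum < n * upperLimit;
    }

    static void Main()
    {
        Console.WriteLine(IsFeasible(30, -50, 50, 300));   // the question's example
        Console.WriteLine(IsFeasible(30, -50, 50, 1500));  // all values would have to equal 50
        Console.WriteLine(IsFeasible(100, -500000, 5000, -1000));
    }
}
```

Calling such a check before the `while` loop and throwing when it fails would turn an endless search into an immediate error.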
