
Distributed probability random number generator

I want to generate a number based on a distributed probability. For example, say there are the following occurrences of each number:

Number | Count
1      | 150
2      | 40
3      | 15
4      | 3

with a total of (150+40+15+3) = 208     
then the probability of a 1 is 150/208= 0.72    
and the probability of a 2 is 40/208 = 0.192    

How do I make a random number generator that returns numbers based on this probability distribution?

I'm happy for this to be based on a static, hardcoded set for now but I eventually want it to derive the probability distribution from a database query.

I've seen similar examples like this one but they are not very generic. Any suggestions?

The general approach is to feed uniformly distributed random numbers from the interval 0..1 into the inverse of the cumulative distribution function of your desired distribution.

Thus in your case, just draw a random number x from 0..1 (for example with Random.NextDouble()) and, based on its value, return

  • 1 if 0 <= x < 150/208,
  • 2 if 150/208 <= x < 190/208,
  • 3 if 190/208 <= x < 205/208 and
  • 4 otherwise.
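The same rule can be written generically for any list of counts. The following is only a minimal sketch, not the answerer's code: the class name InverseCdfSampler and its members are made up for illustration. It computes the cumulative probabilities once and then scans them for each draw.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not from the original answer): maps a uniform
// x in [0,1) to a 1-based number using cumulative probabilities.
public class InverseCdfSampler
{
    private readonly double[] _cumulative; // _cumulative[i] = P(number <= i + 1)
    private readonly Random _random = new Random();

    public InverseCdfSampler(IReadOnlyList<int> counts)
    {
        if (counts == null || counts.Count == 0)
            throw new ArgumentException("counts must not be empty.");
        double total = 0;
        foreach (var c in counts)
        {
            if (c < 0) throw new ArgumentException("counts must be non-negative.");
            total += c;
        }
        if (total <= 0) throw new ArgumentException("counts must have a positive sum.");

        _cumulative = new double[counts.Count];
        double running = 0;
        for (int i = 0; i < counts.Count; i++)
        {
            running += counts[i];
            _cumulative[i] = running / total;
        }
    }

    // Deterministic part: returns the number whose interval contains x.
    public int Lookup(double x)
    {
        for (int i = 0; i < _cumulative.Length; i++)
        {
            if (x < _cumulative[i]) return i + 1;
        }
        return _cumulative.Length; // guards against floating-point round-off
    }

    // Draws x with Random.NextDouble() and maps it to a number.
    public int Next()
    {
        return Lookup(_random.NextDouble());
    }
}
```

With the counts from the question, new InverseCdfSampler(new[] { 150, 40, 15, 3 }).Next() should return 1 roughly 72% of the time.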

Do this only once:

  • Write a function that calculates a cdf array given a pdf array. In your example the pdf array is [150,40,15,3] and the cdf array will be [150,190,205,208].

Do this every time:

  • Get a random number in [0,1), multiply it by 208 and round up (or round down and add 1, which avoids the corner case where the random number is exactly 0). You'll have an integer in 1..208. Name it r.
  • Perform a binary search on the cdf array for r. Return the index of the first cell whose value is greater than or equal to r.

The running time will be proportional to the log of the size of the given pdf array, which is good. However, if your array will always be as small as in your example (4 entries), then a linear search is easier and will also perform better.
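The two steps above could be sketched as follows. This is an illustration under assumptions rather than the answerer's code: the class name CdfSearch and its method names are hypothetical, and the binary search is written by hand so that it always returns the first cell whose value is at least r, even when a zero count makes two cells equal.

```csharp
using System;

// Hypothetical helper illustrating the cdf-array approach.
public static class CdfSearch
{
    // Turns counts such as [150,40,15,3] into running totals [150,190,205,208].
    public static int[] BuildCdf(int[] counts)
    {
        var cdf = new int[counts.Length];
        int running = 0;
        for (int i = 0; i < counts.Length; i++)
        {
            running += counts[i];
            cdf[i] = running;
        }
        return cdf;
    }

    // Returns the index of the first cell whose value is >= r,
    // where r is an integer in 1..cdf[cdf.Length - 1].
    public static int FindIndex(int[] cdf, int r)
    {
        int lo = 0, hi = cdf.Length - 1;
        while (lo < hi)
        {
            int mid = lo + (hi - lo) / 2;
            if (cdf[mid] >= r) hi = mid;
            else lo = mid + 1;
        }
        return lo;
    }

    // Draws r uniformly from 1..total and looks it up in the cdf array.
    public static int Sample(int[] cdf, Random random)
    {
        int total = cdf[cdf.Length - 1];
        int r = random.Next(total) + 1; // Random.Next(total) is in 0..total-1
        return FindIndex(cdf, r);
    }
}
```

Sample returns a 0-based index, so add 1 to get the numbers 1..4 from the question.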

There are many ways to generate a random integer with a custom distribution (also known as a discrete distribution). The choice depends on many things, including the number of integers to choose from, the shape of the distribution, and whether the distribution will change over time.

One of the simplest ways to choose an integer with a custom weight function f(x) is the rejection sampling method. The following assumes that the highest possible value of f is max. The time complexity for rejection sampling is constant on average, but depends greatly on the shape of the distribution and has a worst case of running forever. To choose an integer in [1, k] using rejection sampling:

  1. Choose a uniform random integer i in [1, k].
  2. With probability f(i)/max, return i. Otherwise, go to step 1.
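A minimal sketch of these two steps, assuming the weights are non-negative integers held in an array so that weights[i - 1] plays the role of f(i). The class name RejectionSampler is hypothetical.

```csharp
using System;

// Hypothetical sketch of rejection sampling over integer weights.
public static class RejectionSampler
{
    // weights[i - 1] plays the role of f(i); returns an integer in [1, k],
    // where k is weights.Length.
    public static int Next(int[] weights, Random random)
    {
        int max = 0;
        foreach (var w in weights)
        {
            if (w < 0) throw new ArgumentException("weights must be non-negative.");
            if (w > max) max = w;
        }
        if (max == 0) throw new ArgumentException("at least one weight must be positive.");

        while (true)
        {
            // Step 1: choose a uniform random integer i in [1, k].
            int i = random.Next(weights.Length) + 1;
            // Step 2: accept i with probability f(i)/max, otherwise try again.
            if (random.Next(max) < weights[i - 1]) return i;
        }
    }
}
```

For the counts in the question, most rejected draws come from the rarely chosen numbers 3 and 4, which is the dependence on the shape of the distribution mentioned above.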

Other algorithms have an average sampling time that doesn't depend so greatly on the distribution (usually either constant or logarithmic), but often require you to precalculate the weights in a setup step and store them in a data structure. Some of them are also economical in terms of the number of random bits they use on average. These algorithms include the alias method, the Fast Loaded Dice Roller, the Knuth–Yao algorithm, the MVN data structure, and more. See my section "Weighted Choice With Replacement" for a survey.


The following C# code implements Michael Vose's version of the alias method, as described in this article; see also this question. I have written this code for your convenience and provide it here.

using System;
using System.Collections.Generic;

public class LoadedDie {
    Random random=new Random();

    List<long> prob;
    List<int> alias;
    long total;
    int n;
    bool even;

    // Initializes a new fair die with the given number of
    // equally likely choices.
    public LoadedDie(int probs){
        this.prob=new List<long>();
        this.alias=new List<int>();
        this.total=0;
        this.n=probs;
        this.even=true;
    }

    // Initializes a new loaded die.  Probs
    // is a sequence of numbers indicating the relative
    // probability of each choice relative to all the
    // others.  For example, if probs is [3,4,2], then
    // the chances are 3/9, 4/9, and 2/9, since the probabilities
    // add up to 9.
    public LoadedDie(IEnumerable<int> probs){
        // Raise an error if null
        if(probs==null)throw new ArgumentNullException("probs");
        this.prob=new List<long>();
        this.alias=new List<int>();
        this.total=0;
        this.even=false;
        var small=new List<int>();
        var large=new List<int>();
        var tmpprobs=new List<long>();
        foreach(var p in probs){
            tmpprobs.Add(p);
        }
        this.n=tmpprobs.Count;
        // Get the max and min choice and calculate total
        long mx=-1, mn=-1;
        foreach(var p in tmpprobs){
            if(p<0)throw new ArgumentException("probs contains a negative probability.");
            mx=(mx<0 || p>mx) ? p : mx;
            mn=(mn<0 || p<mn) ? p : mn;
            this.total+=p;
        }
        // We use a shortcut if all probabilities are equal
        if(mx==mn){
            this.even=true;
            return;
        }
        // Clone the probabilities and scale them by
        // the number of probabilities
        for(var i=0;i<tmpprobs.Count;i++){
            tmpprobs[i]*=this.n;
            this.alias.Add(0);
            this.prob.Add(0);
        }
        // Use Michael Vose's alias method
        for(var i=0;i<tmpprobs.Count;i++){
            if(tmpprobs[i]<this.total)
                small.Add(i); // Smaller than probability sum
            else
                large.Add(i); // Probability sum or greater
        }
        // Calculate probabilities and aliases
        while(small.Count>0 && large.Count>0){
            var l=small[small.Count-1];small.RemoveAt(small.Count-1);
            var g=large[large.Count-1];large.RemoveAt(large.Count-1);
            this.prob[l]=tmpprobs[l];
            this.alias[l]=g;
            var newprob=(tmpprobs[g]+tmpprobs[l])-this.total;
            tmpprobs[g]=newprob;
            if(newprob<this.total)
                small.Add(g);
            else
                large.Add(g);
        }
        foreach(var g in large)
            this.prob[g]=this.total;
        foreach(var l in small)
            this.prob[l]=this.total;
    }
    
    // Returns the number of choices.
    public int Count {
        get {
            return this.n;
        }
    }
    // Chooses a choice at random, ranging from 0 to the number of choices
    // minus 1.
    public int NextValue(){
        var i=random.Next(this.n);
        return (this.even || random.Next((int)this.total)<this.prob[i]) ? i : this.alias[i];
    }
}

Example:

 var loadedDie=new LoadedDie(new int[]{150,40,15,3}); // list of probabilities for each number:
                                                      // 0 is 150, 1 is 40, and so on
 int number=loadedDie.NextValue(); // return a number from 0-3 according to given probabilities;
                                   // the number can be an index to another array, if needed

I place this code in the public domain.

I know this is an old post, but I also searched for such a generator and was not satisfied with the solutions I found. So I wrote my own and want to share it with the world.

Just call "Add(...)" a few times before you call "NextItem(...)".

using System;
using System.Collections.Generic;
using System.Linq;

/// <summary> A class that will return one of the given items with a specified possibility. </summary>
/// <typeparam name="T"> The type to return. </typeparam>
/// <example> If the generator has only one item, it will always return that item. 
/// If there are two items with possibilities of 0.4 and 0.6 (you could also use 4 and 6 or 2 and 3) 
/// it will return the first item 4 times out of ten, the second item 6 times out of ten. </example>
public class RandomNumberGenerator<T>
{
    private List<Tuple<double, T>> _items = new List<Tuple<double, T>>();
    private Random _random = new Random();

    /// <summary>
    /// All items possibilities sum.
    /// </summary>
    private double _totalPossibility = 0;

    /// <summary>
    /// Adds a new item to return.
    /// </summary>
    /// <param name="possibility"> The possibility to return this item. Is relative to the other possibilites passed in. </param>
    /// <param name="item"> The item to return. </param>
    public void Add(double possibility, T item)
    {
        _items.Add(new Tuple<double, T>(possibility, item));
        _totalPossibility += possibility;
    }

    /// <summary>
    /// Returns a random item from the list with the specified relative possibility.
    /// </summary>
    /// <exception cref="InvalidOperationException"> If there are no items to return from. </exception>
    public T NextItem()
    {
        var rand = _random.NextDouble() * _totalPossibility;
        double value = 0;
        foreach (var item in _items)
        {
            value += item.Item1;
            if (rand < value) // strict comparison so a zero-weight item is never returned
                return item.Item2;
        }
        return _items.Last().Item2; // Only reached through floating-point round-off
    }
}

Use my method. It is simple and easy to understand. I don't compute portions in the range 0...1, I just use a "Probability Pool" (sounds cool, yeah?)

The pie chart shows the weight of every element in the pool.

Here is an implementation of cumulative probability for roulette-wheel selection:

using System;
using System.Collections.Generic;

// A class or struct that represents the items you want to pick from
public class Item
{
    public string name; // not only string, any type of data
    public int chance;  // chance of getting this Item
}

public class ProportionalWheelSelection
{
    public static Random rnd = new Random();

    // Static method that can be called from anywhere. You can add an overload that accepts arrays as well as lists:
    // public static Item SelectItem (Item[] items)...
    public static Item SelectItem(List<Item> items)
    {
        // Calculate the sum of all chances.
        int poolSize = 0;
        for (int i = 0; i < items.Count; i++)
        {
            poolSize += items[i].chance;
        }

        // Get a random integer from 1 to poolSize.
        int randomNumber = rnd.Next(0, poolSize) + 1;

        // Detect the item, which corresponds to current random number.
        int accumulatedProbability = 0;
        for (int i = 0; i < items.Count; i++)
        {
            accumulatedProbability += items[i].chance;
            if (randomNumber <= accumulatedProbability)
                return items[i];
        }
        return null;    // only reached if the list is empty or all chances are zero
    }
}

// Example of use somewhere in your program:
static void Main(string[] args)
{
    List<Item> items = new List<Item>();
    items.Add(new Item() { name = "Anna", chance = 100});
    items.Add(new Item() { name = "Alex", chance = 125});
    items.Add(new Item() { name = "Dog", chance = 50});
    items.Add(new Item() { name = "Cat", chance = 35});

    Item newItem = ProportionalWheelSelection.SelectItem(items);
}

Here's an implementation using the inverse distribution function:

using System;
using System.Linq;

    // ...
    private static readonly Random RandomGenerator = new Random();

    private int GetDistributedRandomNumber()
    {
        double totalCount = 208;

        var number1Prob = 150 / totalCount;
        var number2Prob = (150 + 40) / totalCount;
        var number3Prob = (150 + 40 + 15) / totalCount;

        var randomNumber = RandomGenerator.NextDouble();

        int selectedNumber;

        if (randomNumber < number1Prob)
        {
            selectedNumber = 1;
        }
        else if (randomNumber >= number1Prob && randomNumber < number2Prob)
        {
            selectedNumber = 2;
        }
        else if (randomNumber >= number2Prob && randomNumber < number3Prob)
        {
            selectedNumber = 3;
        }
        else
        {
            selectedNumber = 4;
        }

        return selectedNumber;
    }

An example to verify the random distribution:

        int totalNumber1Count = 0;
        int totalNumber2Count = 0;
        int totalNumber3Count = 0;
        int totalNumber4Count = 0;

        int testTotalCount = 100;

        foreach (var unused in Enumerable.Range(1, testTotalCount))
        {
            int selectedNumber = GetDistributedRandomNumber();

            Console.WriteLine($"selected number is {selectedNumber}");

            if (selectedNumber == 1)
            {
                totalNumber1Count += 1;
            }

            if (selectedNumber == 2)
            {
                totalNumber2Count += 1;
            }

            if (selectedNumber == 3)
            {
                totalNumber3Count += 1;
            }

            if (selectedNumber == 4)
            {
                totalNumber4Count += 1;
            }
        }

        Console.WriteLine("");
        Console.WriteLine($"number 1 -> total selected count is {totalNumber1Count} ({100 * (totalNumber1Count / (double) testTotalCount):0.0} %) ");
        Console.WriteLine($"number 2 -> total selected count is {totalNumber2Count} ({100 * (totalNumber2Count / (double) testTotalCount):0.0} %) ");
        Console.WriteLine($"number 3 -> total selected count is {totalNumber3Count} ({100 * (totalNumber3Count / (double) testTotalCount):0.0} %) ");
        Console.WriteLine($"number 4 -> total selected count is {totalNumber4Count} ({100 * (totalNumber4Count / (double) testTotalCount):0.0} %) ");

Example output:

 selected number is 1
 selected number is 1
 selected number is 1
 selected number is 1
 selected number is 2
 selected number is 1
 ...
 selected number is 2
 selected number is 3
 selected number is 1
 selected number is 1
 selected number is 1
 selected number is 1
 selected number is 1

 number 1 -> total selected count is 71 (71.0 %)
 number 2 -> total selected count is 20 (20.0 %)
 number 3 -> total selected count is 8 (8.0 %)
 number 4 -> total selected count is 1 (1.0 %)

Thanks for all your solutions guys! Much appreciated!

@Menjaraz I tried implementing your solution as it looks very resource friendly, however I had some difficulty with the syntax.

So for now, I just transformed my summary into a flat list of values using LINQ SelectMany() and Enumerable.Repeat().

public class InventoryItemQuantityRandomGenerator
{
    private readonly Random _random;
    private readonly IQueryable<int> _quantities;

    public InventoryItemQuantityRandomGenerator(IRepository database, int max)
    {
        _quantities = database.AsQueryable<ReceiptItem>()
            .Where(x => x.Quantity <= max)
            .GroupBy(x => x.Quantity)
            .Select(x => new
                             {
                                 Quantity = x.Key,
                                 Count = x.Count()
                             })
            .SelectMany(x => Enumerable.Repeat(x.Quantity, x.Count));

        _random = new Random();
    }

    public int Next()
    {
        // Random.Next(max) excludes max, so passing Count() covers every index including the last.
        return _quantities.ElementAt(_random.Next(_quantities.Count()));
    }
}
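Repeating every quantity Count times can produce a long list, and with an IQueryable source the query may well be evaluated again on each call to Next(). One alternative, sketched below under assumptions, is to load the grouped (quantity, count) pairs once and sample from their running totals. The class name QuantitySampler and the dictionary input are hypothetical stand-ins for whatever the repository query returns.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical alternative: sample a quantity from (quantity -> count) pairs
// that have already been loaded into memory.
public class QuantitySampler
{
    private readonly int[] _quantities;
    private readonly int[] _runningTotals;
    private readonly Random _random = new Random();

    public QuantitySampler(IDictionary<int, int> countsByQuantity)
    {
        // Keep only quantities that actually occurred, in a stable order.
        var pairs = countsByQuantity.Where(p => p.Value > 0)
                                    .OrderBy(p => p.Key)
                                    .ToList();
        if (pairs.Count == 0)
            throw new ArgumentException("No quantities with a positive count.");

        _quantities = pairs.Select(p => p.Key).ToArray();
        _runningTotals = new int[pairs.Count];
        int running = 0;
        for (int i = 0; i < pairs.Count; i++)
        {
            running = checked(running + pairs[i].Value); // throws on overflow
            _runningTotals[i] = running;
        }
    }

    // Returns the quantity whose block of the running totals contains r (1-based).
    public int Lookup(int r)
    {
        for (int i = 0; i < _runningTotals.Length; i++)
        {
            if (r <= _runningTotals[i]) return _quantities[i];
        }
        throw new ArgumentOutOfRangeException(nameof(r));
    }

    public int Next()
    {
        int total = _runningTotals[_runningTotals.Length - 1];
        return Lookup(_random.Next(total) + 1); // r is uniform in 1..total
    }
}
```

The memory used then depends on the number of distinct quantities, not on the number of receipt items.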
