How do I generate random vectors from a bar graph of probabilities?

I have generated a bar graph that counts the number of 1's in each bit position of a set of 16-bit binary strings:

[image: bar graph of the number of 1's in each bit position]

I would like to generate 300 binary vectors of 16 bits that roughly follow the above distribution.

Initially, my code depended on a probability array gengen, which counts the number of times 1 appears in each bit. I generated a 300x16 matrix of random values, compared each value to the probability for its bit, and assigned 1 or 0 accordingly. This is similar to the weighted coin approach.

However, the distribution I got was nearly uniform.

gengen = [30, 28, 30, 30, 30, 26, 28, 28, 29, 23, 17, 8, 10, 12, 7, 6]

len = 16; % string length

% probability for 1 in each bit
prob_gengen = gengen./30;
% generate 300 random strings
moreStrings = rand(300,len);
samplemore = []
for i = 1:300
  for k = 1:16
    disp(i)
    disp(k)
    if (prob_gengen(k) > moreStrings(i,k))
        samplemore(i,k) = 1;
    else
        samplemore(i,k) = 0;
    end
  end
end
G = reshape(samplemore,[16,300])

And this code plots the final distribution:

colormap('default')
subplot(3,1,1)
bar(sum(G,2)); % summing along rows using DIM = 2
xlabel('Frequency Bin ');
title('Generator (16b) Generated');

[image: bar graph of the generated distribution]

What can I do to get a distribution similar to the first bar chart? The code itself is in MATLAB, but I think a Python implementation can work as well.

I believe the main source of the error is the reshaping step you do to get G. If you want to reorganize your data from a 300-by-16 matrix to a 16-by-300 matrix, you should transpose it (G = samplemore.'). Using reshape breaks up your old 300-element columns and spreads them out across your new 16-element columns, so each row of G mixes values from different bit positions and the row sums come out nearly equal.
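To make the difference between reshape and transpose concrete, here is a minimal sketch on a tiny matrix. The 3-by-2 matrix A is just an illustrative stand-in for the 300-by-16 samplemore, and the names A, T, and R are not from the question.

```matlab
% Small 3-by-2 matrix standing in for the 300-by-16 sample matrix.
A = [1 2;
     3 4;
     5 6];

% Transpose: each original column becomes a row and stays intact.
T = A.';                  % [1 3 5; 2 4 6]

% Reshape: refills a 2-by-3 matrix in column-major order,
% so values from the two original columns get interleaved.
R = reshape(A, [2, 3]);   % [1 5 4; 3 2 6]

% Row sums of T match the original column sums; row sums of R do not.
disp(sum(A, 1));          % 9 12
disp(sum(T, 2).');        % 9 12
disp(sum(R, 2).');        % 10 11
```

The row sums of R are already closer together than the true column sums, which is the same flattening effect you see in your second bar graph.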

However, you can do all this without loops using bsxfun:

gengen = [30, 28, 30, 30, 30, 26, 28, 28, 29, 23, 17, 8, 10, 12, 7, 6];
len = 16;
prob_gengen = gengen./30;
binvecs = bsxfun(@le, rand(300, len), prob_gengen);
bar(sum(binvecs, 1));

And here's the bar graph, which looks a lot like your first bar graph above:

[image: bar graph of the column sums of binvecs]

How it works...

First you create a 300-by-16 matrix of uniformly distributed random values between 0 and 1 with rand. Next, you'll want to check the random values in each column against the probability in the prob_gengen vector. For example, the 16th column has a probability of 0.2 (6/30), which means that, on average, you want one fifth of the values in that column to be 1 and four fifths to be 0. You can do this by simply checking whether the randomly generated value is less than or equal to this probability threshold. Using bsxfun expands the vector prob_gengen as needed to do the comparison for all rows.
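As a rough check of the thresholding step described above, the sketch below compares the observed fraction of 1's in each column with prob_gengen. The names nVectors, U, binvecs2, and observed are introduced here for illustration only. The second comparison assumes MATLAB R2016b or newer, where implicit expansion lets the plain <= operator do what bsxfun does.

```matlab
gengen = [30, 28, 30, 30, 30, 26, 28, 28, 29, 23, 17, 8, 10, 12, 7, 6];
prob_gengen = gengen ./ 30;               % probability of a 1 in each bit
nVectors = 300;                           % number of vectors, as in the question

U = rand(nVectors, numel(prob_gengen));   % uniform draws in (0,1)

% Compare every row of U against the 1-by-16 probability vector.
binvecs = bsxfun(@le, U, prob_gengen);

% Assumption: R2016b or newer, where implicit expansion is available.
binvecs2 = U <= prob_gengen;

% Observed fraction of 1's per bit (second row) next to the target (first row).
observed = mean(binvecs, 1);
disp([prob_gengen; observed]);
```

Columns whose probability is exactly 1 (bits 1, 3, 4, and 5) always come out as 1, since rand never returns a value above 1. The other columns only match their targets approximately, with the gap shrinking as nVectors grows.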
