
How do I generate random vectors from a bar graph of probabilities?

I have generated a bar graph that counts the number of 1's in each bit of a 16-digit binary string:

[Image: bar graph of the number of 1's in each bit position]

I would like to generate 300 binary vectors of 16 bits that roughly follow the above distribution.

Initially, my code depended on a probability array gengen, which counts the number of times 1 appears in each bit. I generated a 300x16 matrix of random values, compared each value to the probability for its bit, and assigned 1 or 0 accordingly. This is similar to the weighted coin approach.

However, the distribution I received was nearly uniform.

gengen = [30, 28, 30, 30, 30, 26, 28, 28, 29, 23, 17, 8, 10, 12, 7, 6]

len = 16; % string length

% probability for 1 in each bit
prob_gengen = gengen./30;
% generate 300 random strings
moreStrings = rand(300,len);
samplemore = []
for i = 1:300
  for k = 1:16
    disp(i)
    disp(k)
    if (prob_gengen(k) > moreStrings(i,k))
        samplemore(i,k) = 1;
    else
        samplemore(i,k) = 0;
    end
  end
end
G = reshape(samplemore,[16,300])

And this code plots the final distribution:

colormap('default')
subplot(3,1,1)
bar(sum(G,2)); % summing along rows using DIM = 2
xlabel('Frequency Bin ');
title('Generator (16b) Generated');

[Image: bar graph of the generated distribution, nearly uniform]

What can I do to have a distribution similar to the first bar chart? The code itself is in MATLAB, but I think a Python implementation can work as well.

I believe the main source of the error is the reshape step you use to get G. If you want to reorganize your data from a 300-by-16 matrix into a 16-by-300 matrix, you should transpose it (G = samplemore.'). reshape keeps the elements in column-major order, so it breaks up your old 300-element columns and spreads them across the new 16-element columns, mixing bits with different probabilities into every row.
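To see the difference on a small case, here is a minimal Python sketch (the question mentions Python would also be acceptable). The helper names reshape_column_major and transpose are hypothetical, written only for this illustration; the first one is an assumption-level imitation of how MATLAB's reshape reads and refills elements column by column.

```python
def reshape_column_major(matrix, new_rows, new_cols):
    """Imitate MATLAB's reshape: read column by column, refill column by column."""
    # Flatten in column-major order (all of column 0, then column 1, ...).
    flat = [row[c] for c in range(len(matrix[0])) for row in matrix]
    if len(flat) != new_rows * new_cols:
        raise ValueError("element count must stay the same")
    # Element (r, c) of the new matrix is the (c * new_rows + r)-th flat element.
    return [[flat[c * new_rows + r] for c in range(new_cols)]
            for r in range(new_rows)]


def transpose(matrix):
    """Swap rows and columns, keeping each original column together."""
    return [list(col) for col in zip(*matrix)]


# 3-by-2 example: column 0 is always 1, column 1 is always 0.
small = [[1, 0],
         [1, 0],
         [1, 0]]

reshaped = reshape_column_major(small, 2, 3)
transposed = transpose(small)

print(reshaped)    # rows now mix values from both original columns
print(transposed)  # each row is exactly one original column
```

With the transpose, the row sums are 3 and 0, matching the original columns. With the reshape, they become 2 and 1, which is the same smoothing effect that flattens the 16-bit bar graph.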

However, you can do all of this without loops using bsxfun:

gengen = [30, 28, 30, 30, 30, 26, 28, 28, 29, 23, 17, 8, 10, 12, 7, 6];
len = 16;
prob_gengen = gengen./30;
binvecs = bsxfun(@le, rand(300, len), prob_gengen);
bar(sum(binvecs, 1));

And here's the bar graph, which looks a lot like your first bar graph above:

[Image: bar graph of the per-bit counts of 1's in the generated vectors]

How it works...

First, you create a 300-by-16 matrix of uniformly distributed random values between 0 and 1 with rand. Next, you check the random values in each column against the corresponding probability in the prob_gengen vector. For example, the 16th column has a probability of 6/30 = 0.2, which means that (on average) you want one fifth of the values in that column to be 1 and four fifths to be 0. You can get this by checking whether each randomly generated value is less than or equal to its probability threshold. bsxfun expands the 1-by-16 vector prob_gengen as needed to do the comparison for all 300 rows. In MATLAB R2016b and later, implicit expansion lets you write rand(300, len) <= prob_gengen directly.
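Since the question mentions that a Python implementation would also work, here is a rough standard-library sketch of the same per-column threshold comparison. The function name sample_binary_vectors and the fixed seed are assumptions made for illustration. It uses a strict < instead of <=, which is equivalent in practice for continuous random values and keeps a probability of 0 from ever producing a 1.

```python
import random

# Counts of 1's per bit from the question, out of 30 observations each.
gengen = [30, 28, 30, 30, 30, 26, 28, 28, 29, 23, 17, 8, 10, 12, 7, 6]
prob_gengen = [count / 30 for count in gengen]


def sample_binary_vectors(probs, n_vectors, rng):
    """Draw n_vectors binary vectors; bit k is 1 with probability probs[k]."""
    return [[1 if rng.random() < p else 0 for p in probs]
            for _ in range(n_vectors)]


rng = random.Random(0)  # fixed seed (an assumption) so repeated runs match
binvecs = sample_binary_vectors(prob_gengen, 300, rng)

# Sum down each column, like sum(binvecs, 1) in the MATLAB answer.
column_counts = [sum(col) for col in zip(*binvecs)]
expected_counts = [300 * p for p in prob_gengen]

for k, (got, want) in enumerate(zip(column_counts, expected_counts), start=1):
    print(f"bit {k:2d}: {got:3d} ones (expected about {want:.0f})")
```

The printed counts should track the expected values only roughly, since each bit is an independent random draw. Scaling the counts by 300/30 = 10 is what makes the result comparable to the original bar graph.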
