Here is what I am trying to do:
I have a set of items, let's say A to J, 10 items in total. I want to generate 20 draws, and in each draw I need 3 of those 10 items. If the first item comes out as A, then A must not show up as the second or third item, irrespective of its assigned probability.
Let's say:
A - 4%
B - 20%
C - 1%
D - 16%
E - 5%
F - 7%
G - 3%
H - 21%
I - 6%
J - 17%
Now I need to randomly generate 3 items from the above list in each draw, according to their assigned probabilities; but if, say, the first item is B, the second and third items must not be B. The same process should be repeated for 20 draws.
The answer should look something like this:
          1st Item  2nd Item  3rd Item
1st Draw  B         D         J
2nd Draw  D         E         F
3rd Draw  B         H         G
The numbers should be generated according to their assigned probabilities.
Thanks in advance.
For a formula route:
You will need to build two helper columns. The first is a running total.
I put your values in G1:H10 (items in column G, probabilities in column H).
Then in I1 I put 1.
In I2 I put:
=I1+(H1*100)
And copied it down.
I then created the second helper. In K1 I put:
=INDEX(G:G,MATCH(ROW(1:1),I:I))
And copied down 100 rows.
This creates a 100-row lookup column in which each item appears once for every percentage point of its probability.
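To make the two helper columns concrete, here is a rough Python sketch of what they end up holding. It is only an illustration of the spreadsheet logic, not part of the Excel solution; the names `weights`, `starts` and `slots` are mine, and it assumes the probabilities are whole percentages that sum to 100.

```python
from itertools import accumulate

# Probabilities from the question, as whole percentages (assumed to sum to 100).
weights = {"A": 4, "B": 20, "C": 1, "D": 16, "E": 5,
           "F": 7, "G": 3, "H": 21, "I": 6, "J": 17}

# Helper column I: the 1-based slot where each item's block starts (1, 5, 25, ...).
starts = [1] + [1 + total for total in accumulate(weights.values())][:-1]

# Helper column K: 100 slots, each item repeated once per percentage point.
slots = [item for item, pct in weights.items() for _ in range(pct)]
```

Picking a uniformly random slot from `slots` then returns each item with its assigned probability, which is what the lookup column is for.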
Then in B2 I put:
=INDEX($K:$K,AGGREGATE(15,6,ROW($1:$100)/(COUNTIF($A2:A2,$K$1:$K$100)=0),RANDBETWEEN(1,100-SUMPRODUCT(COUNTIF($A2:A2,$K$1:$K$100)))))
Then I copied it across three columns and down as many rows as wanted.
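Caveats:
1. The probabilities must be whole percentages; a value such as 20.513% cannot be represented exactly in the 100-row lookup.
2. The smallest usable probability is 1%.
3. The probabilities must add up to 100%.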
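For anyone who wants to check the logic of the B2 formula outside Excel, the following Python sketch mimics it under the same assumptions (whole percentages summing to 100). The function name `draw` and its parameters are hypothetical; the filter plays the role of the COUNTIF(...)=0 test and the random index plays the role of RANDBETWEEN feeding AGGREGATE.

```python
import random

# Probabilities from the question, as whole percentages (assumed to sum to 100).
weights = {"A": 4, "B": 20, "C": 1, "D": 16, "E": 5,
           "F": 7, "G": 3, "H": 21, "I": 6, "J": 17}
# Equivalent of helper column K: one slot per percentage point.
slots = [item for item, pct in weights.items() for _ in range(pct)]

def draw(k=3, rng=random):
    """Return k distinct items, each chosen in proportion to its remaining slots."""
    picked = []
    for _ in range(k):
        # Keep only slots whose item has not been drawn yet in this draw.
        remaining = [s for s in slots if s not in picked]
        # Pick the n-th remaining slot, with n uniform on 1..len(remaining).
        n = rng.randint(1, len(remaining))
        picked.append(remaining[n - 1])
    return picked

draws = [draw() for _ in range(20)]
```

Each entry of `draws` is one row of the result table. Because drawn items lose all their slots, an item cannot repeat within a draw, and the remaining items keep their relative weights.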
Here is another way, based on post #7 of this thread ( https://www.mrexcel.com/forum/excel-questions/372071-random-numbers-assigned-probabilities.html ). You can use the cumulative probabilities and do a similar test.
Add a helper column C with this formula in C2, and drag down:
=SUM($B$2:B2)
In cell F2, you can enter this formula:
=INDEX($A$2:$A$11,COUNTIF($C$2:$C$11,"<="&RAND())+1)
It basically counts the cumulative values that are at or below the RAND() result and adds 1 to get the position of the item to pick. Give it a try and let me know.
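The same cumulative lookup can be sketched in Python to see why counting works. This is only an illustration with names of my own choosing (`items`, `probs`, `cumulative`, `pick`). Note that, like the F2 formula as written, it picks a single item and does not by itself prevent repeats within a draw.

```python
import random
from bisect import bisect_right
from itertools import accumulate

items = list("ABCDEFGHIJ")
probs = [0.04, 0.20, 0.01, 0.16, 0.05, 0.07, 0.03, 0.21, 0.06, 0.17]

# Helper column C: running total of the probabilities.
cumulative = list(accumulate(probs))

def pick(u):
    """Return the item whose cumulative band contains u (0 <= u < 1)."""
    # bisect_right counts the cumulative values <= u, like COUNTIF(..., "<="&u).
    # Python lists are 0-based, so no +1 is needed here.
    index = bisect_right(cumulative, u)
    # Guard against floating-point rounding in the last running total.
    return items[min(index, len(items) - 1)]

item = pick(random.random())
```

A value of `u` below 0.04 lands on A, a value from 0.04 up to 0.24 lands on B, and so on, so each item is chosen with its assigned probability.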
Here's something that combines the approaches in the two earlier answers. Like ian0411's answer, it utilises the cumulative probability distribution. It also utilises Scott Craner's technique of constructing an array of 0's and 1's which indicate matches between the possible outcomes (A,...,J) and those which have already been drawn.
The formula in cell P2 is
=INDEX($A$2:$A$11,1+IFERROR(MATCH(RAND(),MMULT($D$2:$M$11,($B$2:$B$11)*(1-COUNTIF($O2:O2,$A$2:$A$11))/SUMPRODUCT($B$2:$B$11,(1-COUNTIF($O2:O2,$A$2:$A$11))))),0))
and this is copied into cells Q2 and R2 and then dragged down for each draw.
The probability distribution in $B$2:$B$11 has elements replaced by zeroes for any prior drawn items in the current draw (hat-tip to Scott for how this was achieved). The adjusted distribution is converted to a cumulative format via the kludgy matrix multiplication operation (couldn't think of a more elegant approach) and normalised to a "proper" cumulative distribution by dividing by the sum of its elements. Rather than using =1+COUNTIF(cumdist,"<="&RAND()) (where cumdist is the cumulative distribution) to pick out the element of cumdist matching to the random variable, I have used an alternative of =1+IFERROR(MATCH(RAND(),cumdist),0).
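As a cross-check of the steps described above (zero out drawn items, accumulate, normalise, look up), here is a minimal Python sketch. It is my own illustration rather than a translation of the formula: a running sum stands in for the MMULT step (I am assuming $D$2:$M$11 holds a lower-triangular matrix of ones, which is not spelled out above), and the names `draw`, `adjusted` and `cumdist` are hypothetical.

```python
import random
from bisect import bisect_right
from itertools import accumulate

items = list("ABCDEFGHIJ")
probs = [0.04, 0.20, 0.01, 0.16, 0.05, 0.07, 0.03, 0.21, 0.06, 0.17]

def draw(k=3, rng=random):
    """Return k distinct items using a renormalised cumulative distribution."""
    picked = []
    for _ in range(k):
        # Zero the probability of anything already drawn in this draw.
        adjusted = [0.0 if item in picked else p for item, p in zip(items, probs)]
        running = list(accumulate(adjusted))
        # Divide by the final running total so the last value is exactly 1.0.
        cumdist = [c / running[-1] for c in running]
        # Count of cumulative values <= u gives the 0-based index of the item.
        # A drawn item adds nothing to the running sum, so it can never be hit.
        index = bisect_right(cumdist, rng.random())
        picked.append(items[index])
    return picked

draws = [draw() for _ in range(20)]
```

Unlike the 100-slot lookup, this version does not need the probabilities to be whole percentages, since the renormalisation works on whatever weights remain.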