
Calculate probabilities in R programming

I'm new to R and I'm doing a practice question on it. Calculate the probability of drawing two face cards (Jack, Queen, King) in a row. Simulate a standard deck of 52 cards (no Jokers). Sample two cards from the deck 1000 times (remember, we do not replace the card after drawing). How does the proportion of times two face cards were drawn compare to the probability you calculated? This is what I tried:

 poker <- c(1:10, "J", "Q", "K")
 
 poker_face <- sample(poker, size = 1000, replace = FALSE)

It gives me:

Error in sample.int(length(x), size, replace, prob) : cannot take a sample larger than the population when 'replace = FALSE'

Instead of taking 2 cards out of your deck without replacing, your code is trying to take 1000 cards out without putting any back. Since the deck doesn't have 1000 cards to draw, it can't take the sample.

To illustrate, try reducing 1000 to a smaller number (like 2) and see if the error goes away. You'll want to replicate that test 1000 times.
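A quick way to check this, as a minimal sketch using the deck from the question. The names two_cards and err are just illustrative, and tryCatch() is only there so the failing call does not stop the script:

```r
# Deck as defined in the question (13 values, coerced to character)
poker <- c(1:10, "J", "Q", "K")

# Drawing 2 cards without replacement works
two_cards <- sample(poker, size = 2, replace = FALSE)
two_cards

# Drawing 1000 cards without replacement fails; capture the error message
err <- tryCatch(
  sample(poker, size = 1000, replace = FALSE),
  error = function(e) conditionMessage(e)
)
err
```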

Also note that your deck is 13 cards instead of 52. If you were just taking a single card out (or replacing it afterward) it wouldn't affect the odds; it's still even odds of drawing any given value. But since you're sampling two cards without replacement, you'll need a full deck.

Say the first card drawn is a king. In a full deck, the second draw would be made from 3 remaining kings and 4 of every other value. With your 13-card deck, the second draw is made with no kings left and only 1 of every other value.
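To see how much the deck size matters, here is a rough sketch that builds a 52-card deck and works out the exact probability for both decks. It assumes suits can be ignored, so the full deck is just the 13 values repeated 4 times. The names values, deck, n_face, p_full and p_small are illustrative, not from the question:

```r
# Full 52-card deck: 13 values repeated once per suit (suits are not tracked)
values <- c(1:10, "J", "Q", "K")
deck <- rep(values, times = 4)

n_face <- sum(deck %in% c("J", "Q", "K"))   # 12 face cards

# P(first is a face card) * P(second is a face card, given the first was)
p_full <- (n_face / length(deck)) * ((n_face - 1) / (length(deck) - 1))

# Same calculation for the 13-card deck from the question
p_small <- (3 / 13) * (2 / 12)

p_full    # 12/52 * 11/51, about 0.0498
p_small   # 3/13 * 2/12, about 0.0385
```

So simulating with the 13-card deck would estimate a noticeably smaller probability than the one the exercise asks about.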

I think this is what you really want:

poker_face <- replicate(1000, sample(poker, size = 2, replace = FALSE))

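The misunderstanding is conceptual: you want to repeat the experiment 1000 times, not sample 1000 cards without replacement from one deck. The replicate() call above gives you a matrix with 2 rows and 1000 columns, where each column is the result of one of the 1000 experiments. (For the proportion to match a real deck, poker should hold all 52 cards, for example rep(c(1:10, "J", "Q", "K"), times = 4), rather than the 13 values defined in the question.)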
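If you want to check that shape yourself, a small sketch using the deck from the question:

```r
poker <- c(1:10, "J", "Q", "K")   # deck from the question
poker_face <- replicate(1000, sample(poker, size = 2, replace = FALSE))

dim(poker_face)      # 2 rows, 1000 columns
poker_face[, 1:5]    # first five experiments, one per column
```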

To compute the probability you want, you need the number of simulations in which both cards are face cards. How about:

m <- sum(colSums(matrix(poker_face %in% c("J", "Q", "K"), nrow = 2)) == 2)

Then m/1000 is the estimated probability based on your simulations.
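Putting the pieces together, here is one possible end-to-end sketch that compares the simulated proportion with the exact value. It assumes a 52-card deck with suits ignored, and the seed is arbitrary (only there to make the run reproducible). The names deck, n_sims, draws, is_face, estimate and exact are illustrative:

```r
set.seed(42)  # arbitrary seed, only so the run is reproducible

deck <- rep(c(1:10, "J", "Q", "K"), times = 4)   # 52 cards, suits not tracked
n_sims <- 1000

# One column per experiment, two cards per column
draws <- replicate(n_sims, sample(deck, size = 2, replace = FALSE))

# %in% returns a plain vector, so reshape it back to 2 rows before counting
is_face <- matrix(draws %in% c("J", "Q", "K"), nrow = 2)
m <- sum(colSums(is_face) == 2)

estimate <- m / n_sims
exact <- (12 / 52) * (11 / 51)

c(estimate = estimate, exact = exact)
```

With only 1000 repetitions the estimate will wobble around the exact value from run to run; increasing n_sims should tighten it.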

poker is a vector of length 13. You are attempting to take a sample of size 1000 from 13 without replacement. The question asks for a sample of size 2, 1000 times.

Try the following line of code...

sample(poker, size = 2, replace = FALSE)

...then repeat this call 1000 times to obtain the proportion of times two face cards were drawn.
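One way to do the repeating, as a sketch only: wrap a single draw in a small helper and average the results. both_face() is a hypothetical helper name, not something from base R, and the deck is the one from the question:

```r
poker <- c(1:10, "J", "Q", "K")   # deck from the question

# Hypothetical helper: one experiment, TRUE if both cards are face cards
both_face <- function(deck) {
  hand <- sample(deck, size = 2, replace = FALSE)
  all(hand %in% c("J", "Q", "K"))
}

results <- replicate(1000, both_face(poker))   # logical vector of length 1000
mean(results)                                  # proportion of TRUE values
```

Taking mean() of a logical vector works because TRUE counts as 1 and FALSE as 0.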

Because I can't make comments yet, I wanted to add on to Zheyuan Li's answer and explain %in%, which you were asking about.

%in% does a logical match: for every cell in the matrix it returns TRUE if that cell is equal to one of the values in the c() list, and FALSE otherwise.
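A tiny example may make that concrete. The matrix hands below is made up for illustration, with three hands stored one per column:

```r
hands <- matrix(c("K", "7",
                  "J", "Q",
                  "10", "3"), nrow = 2)   # three example hands, one per column

flags <- hands %in% c("J", "Q", "K")
flags                              # TRUE FALSE TRUE TRUE FALSE FALSE (a plain vector)

matrix(flags, nrow = 2)            # reshaped so each column is one hand again
colSums(matrix(flags, nrow = 2))   # 1 2 0 face cards per hand
```

Note that %in% drops the matrix shape, which is why the original line wraps it in matrix(..., nrow = 2) before calling colSums().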

Another way to think of it is if you compare it to this grepl() statement:

m <- sum(colSums(matrix(grepl("J|Q|K", poker_face), nrow = 2)) == 2)

For this deck it gives the same result as the original line of code:

m <- sum(colSums(matrix(poker_face %in% c("J", "Q", "K"), nrow = 2)) == 2)

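The difference is that grepl() does regular-expression matching, so it tells me whether each cell in the matrix contains a "J", "Q", or "K", while %in% tests whether the cell is exactly equal to one of the listed values.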
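That distinction does not matter for the face-card letters, but it would for something like "1" versus "10". A short sketch, with cards and deck_values as made-up example vectors:

```r
cards <- c("1", "10", "J", "K")

cards %in% c("1")     # exact match: TRUE FALSE FALSE FALSE
grepl("1", cards)     # pattern found anywhere: TRUE TRUE FALSE FALSE

# For the face-card letters the two approaches agree on this deck
deck_values <- c(1:10, "J", "Q", "K")
identical(deck_values %in% c("J", "Q", "K"),
          grepl("J|Q|K", deck_values))   # TRUE
```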

You can get more on %in% by looking up ?match (or ?"%in%", which opens the same help page).


 