Wrong sample space in simulation

Question

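On Twitter, I came across a puzzle like this:

There are ten coins, each blank on one side and numbered 1 through 10 on the other. Toss all 10 coins and add up the numbers that land face up. What is the probability that this sum is at least 45?

I wanted to build a simulation that reproduces the analytical solution, which is 43/1024, up to some fairly small error.

My first attempt: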

# Create a list of all possible values
values <- c(1:10, rep(0,10))

# Number of trials
nt <- 5e5

# Container vector to store the results
output <- vector(mode = "numeric", length = nt)

# set seed for reproducible result
ns <- 42

# Loop
for (i in 1:nt) {
  set.seed(ns)
  temp <- sample(x = values, size = 10, replace = FALSE)
  output[i] <- sum(temp)
  ns <- ns + 1
}

length(output[output > 44]) / nt


# [1] 0.013736

Second attempt:

rm(list = ls())

values <- 1:10
nt <- 5e5
output <- vector(mode = "numeric", length = nt)
ns <- 42

for (i in 1:nt) {
  set.seed(ns)
  temp <- sample(x = c(0, 1), size = 10, replace = TRUE)
  output[i] <- sum(temp * values)
  ns <- ns + 1
}

length(output[output > 44]) / nt

# [1] 0.042038

# Find the fraction (X / 1024) that gives the minimal error

sim.res <- length(output[output > 44]) / nt
fractions <- (1:2^10)/2^10
difference <- abs(sim.res - fractions)
which(difference == min(difference))

# [1] 43

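Evidently, the only difference between the two approaches is how the sample space used for the simulation is constructed. I have stared at the code and cannot figure out why attempt 1 is wrong. As far as I can tell, the two should behave the same.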

Answer 1

Your first (incorrect) simulation simply picks 10 numbers, without replacement, from this vector:

c(1:10, rep(0, 10))
#>  [1]  1  2  3  4  5  6  7  8  9 10  0  0  0  0  0  0  0  0  0  0

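(An incorrect calculation of the probability of getting ten 0s was removed here.)

Suppose the first of your 10 picks is not 0. Perhaps it is 7. The vector now holds 19 elements, 10 of which are 0, so the probability that the second pick is 0 is 10/19 rather than 1/2.

In the actual scenario as stated, however, it does not matter whether the coin marked 0/7 came up 7: the probability that any other coin comes up 0 is still 1/2, because the coin outcomes are independent.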
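The gap between the two sample spaces can also be checked exactly instead of by simulation. The following is a minimal sketch, not part of the original answer: it enumerates all 2^10 heads/tails patterns with expand.grid, then weights each pattern either as ten independent fair coins or as a draw of 10 elements without replacement from the 20-element vector. The names outcomes, totals, heads, p_coins and p_attempt1 are introduced here for illustration only.

```r
# Enumerate all 2^10 heads/tails patterns; each row is one outcome.
outcomes <- as.matrix(expand.grid(rep(list(0:1), 10)))
totals <- as.vector(outcomes %*% 1:10)  # sum of the face-up numbers per pattern
heads <- rowSums(outcomes)              # how many numbered faces are showing

# Independent fair coins: every pattern has probability 1 / 2^10.
p_coins <- sum(totals >= 45) / 2^10

# Attempt 1: a draw of 10 elements that shows a given set of k numbers must
# also contain 10 - k of the ten zeros, which can happen in choose(10, 10 - k)
# ways out of choose(20, 10) equally likely draws.
p_attempt1 <- sum(choose(10, 10 - heads[totals >= 45])) / choose(20, 10)

p_coins     # 43 / 1024, about 0.0420
p_attempt1  # 2531 / 184756, about 0.0137
```

Both values are close to what the two simulations in the question reported (0.042038 and 0.013736), which suggests the first attempt was estimating a different probability rather than suffering from unusually large sampling noise.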

By the way, the replicate function is practically built for simulation. Here is the R way to write this simulation:

nt <- 1e5
sims <- replicate(nt, sample(0:1, 10, replace = TRUE) * (1:10))

The probability in question is then:

p_gte_45 <- mean(apply(sims, 2, function(x) sum(x) >= 45))
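To judge whether an estimate like this agrees with 43/1024 up to "some fairly small error", it can help to attach a Monte Carlo standard error to it. The sketch below is one possible variant, not the answerer's code: it draws all flips at once with rbinom, uses colSums instead of apply, and assumes the usual binomial standard error sqrt(p * (1 - p) / nt). The seed and the names flips, totals, p_hat and se are arbitrary choices made for this example.

```r
set.seed(42)  # arbitrary seed, only so the run is reproducible
nt <- 1e5

# One column per trial, one row per coin; entries are 0 (blank) or 1 (number).
flips <- matrix(rbinom(nt * 10, size = 1, prob = 0.5), nrow = 10)

# With 10 rows, 1:10 is recycled down each column, so row i is scaled by i.
totals <- colSums(flips * (1:10))

p_hat <- mean(totals >= 45)            # estimated P(sum >= 45)
se <- sqrt(p_hat * (1 - p_hat) / nt)   # binomial standard error of p_hat

c(estimate = p_hat, std_error = se, exact = 43 / 1024)
```

An estimate within two or three standard errors of 43/1024 is consistent with the analytical answer. The first attempt's 0.013736 is far outside that range.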

Solution 1
0 2018-01-31 14:30:31
