Testing 80,000+ simulated sets of normally distributed observations against a null hypothesis

Question

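I need to generate random samples of size 200 (n=200) from a normal distribution with variance 1 and a true mu (mean) that I specify; I then test each sample against a hypothesis: mu <= 1. I need to do this for each of 400 potential true thetas, and for each true theta I need to replicate it 200 times.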

I have already done this for n=1, but I realize my approach does not scale. For each of the 400 thetas, I ran the following:

sample_r200n1_t2=normal(loc=-0.99, scale=1, size=200)
sample_r200n1_t3=normal(loc=-0.98, scale=1, size=200)
sample_r200n1_t4=normal(loc=-0.97, scale=1, size=200)
sample_r200n1_t5=normal(loc=-0.96, scale=1, size=200)
... on and on to loc = 3

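I then tested each element of the generated arrays individually. With n=200, however, this approach would require me to generate tens of thousands of samples, compute the mean of each sample, and then test that mean against my criterion. That has to be done 80,000 times (and on top of that, I need to do it for several different values of n). Clearly, this is not the way to go.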

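How can I get the result I want? For example, is there a way to generate a set of sample means and put them into an array, one per theta? I could then test them as before. Or is there another approach?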

Answer 1

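You can generate all 200 × 200 × 400 = 16 million random values in a single NumPy array (this takes about 122 MiB of memory; check draws.nbytes/1024/1024), and then use SciPy to run a one-sided one-sample t-test on each of the 200 samples of 200 observations for every theta value: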

import numpy as np
from numpy.random import normal
from scipy.stats import ttest_1samp
import matplotlib.pyplot as plt

# Array of loc values; for each loc, we draw 200 
# samples of 200 normally distributed observations
locs = np.linspace(-1, 3, 401)

# Array of shape (401, 200, 200) = (locs, samples, observations)
# Note that 200 draws of 200 i.i.d. observations is the same as
# 1 draw of 200*200 i.i.d. observations, reshaped to (200, 200)
draws = np.array([normal(loc=x, scale=1, size=200*200)
                  for x in locs]).reshape(401, 200, 200)

# axis=-1 runs the t-test across the last axis (observations),
# giving one test per (loc, sample) pair.
# alternative='less' means the alternative hypothesis is that the
# population mean is less than 1, so the null hypothesis is that
# the population mean is greater than or equal to 1
tstats, pvals = ttest_1samp(draws, 1, alternative='less', axis=-1)

# Count how many out of 200 t-tests reject the null hypothesis
# at the alpha=0.05 level
rejects = (pvals < 0.05).sum(axis=1)

# Visual check: rejection counts should be high for true means
# far below 1, as these tests should reject the null
# hypothesis that the population mean is >= 1
plt.plot(locs, rejects)
plt.axvline(1, c='r')
plt.title(r'Number of t-tests rejecting $H_0 : \mu \geq 1$ with $p < 0.05$')
plt.xlabel(r'True population mean $\theta$')
plt.show()
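The question also asks whether the sample means could be generated directly, one array per theta. Below is a minimal sketch of that idea, not part of the original answer. It assumes the variance is known to be 1, so a z-test is used in place of the t-test above, and it tests the hypothesis as stated in the question (H0: mu <= 1, rejected for large sample means). The names rng, thetas, reps, mu0 and the seed are illustrative choices.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility

n = 200       # observations per sample
reps = 200    # replications per theta
mu0 = 1.0     # boundary of the null hypothesis H0: mu <= 1
alpha = 0.05
thetas = np.linspace(-1, 3, 401)

# The mean of n i.i.d. N(theta, 1) observations is N(theta, 1/n),
# so the sample means can be drawn directly without the raw data.
# Resulting shape: (401, reps) = (thetas, replications)
sample_means = rng.normal(loc=thetas[:, None],
                          scale=1 / np.sqrt(n),
                          size=(thetas.size, reps))

# One-sided z-test with known variance 1: large z is evidence against H0
z = (sample_means - mu0) * np.sqrt(n)
pvals = norm.sf(z)  # upper-tail probability P(Z >= z)

# Fraction of replications rejecting H0 for each theta
reject_rate = (pvals < alpha).mean(axis=1)

# Theoretical rejection probability of this z-test, for comparison
power = norm.sf(norm.isf(alpha) - (thetas - mu0) * np.sqrt(n))
```

This draws 401 × 200 values instead of about 16 million, so trying several values of n only means changing n and rerunning. If the variance had to be estimated from the data, the full samples and a t-test, as in the code above, would still be needed.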


Solution 1
0 Accepted 2021-12-19 00:47:26
