Python computing confidence intervals

I'm trying to compute 10,000 confidence intervals (lower and upper bounds) using the following script:

import numpy as np
import statistics as stat
from matplotlib import pyplot as plt

N = 10000
sigma = 1
mu = 10
n = 10

X = []
for i in range (N):
    X.append(mu + sigma*(np.random.normal(size=n)))
    Xbar = np.mean(X, axis=0)
    lower_CI = Xbar - 1.96*sigma/np.sqrt(n)
    upper_CI = Xbar + 1.96*sigma/np.sqrt(n)

After computing the intervals, I need to find the fraction of intervals that include mu = 10. However, I only get 10 upper and lower bounds, not 10,000. Also, Xbar has 10 values. Why does it not have just one value, since it is the mean of X?

Where am I going wrong?

First of all, here is what happens in the code you posted.

mu + sigma*(np.random.normal(size=n)) gives you an array of n=10 samples from a normal distribution with your mu and sigma.

X.append(mu + sigma*(np.random.normal(size=n))) adds this array to your list X, so X becomes a list of arrays.

Xbar = np.mean(X, axis=0) tells numpy to calculate the mean along axis 0 of that list of arrays. That is one mean for each index position across the arrays in X, which is why Xbar has 10 entries.

lower_CI = Xbar - 1.96*sigma/np.sqrt(n) sets lower_CI to Xbar minus the half-width of the confidence interval. Because Xbar has 10 entries, the result also has 10 entries. However, because lower_CI (and likewise upper_CI) is overwritten in every iteration of the for loop instead of being stored in a list, you do not get 10,000 lower bounds.
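To make the axis behaviour concrete, here is a small sketch that builds a short list of arrays shaped like X in the question and compares the shapes of the different means. The seeded generator and the choice of 3 arrays are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded generator, assumed here for reproducibility
n = 10

# A list of 3 arrays with n samples each, like X after 3 loop iterations.
X = [10 + 1 * rng.normal(size=n) for _ in range(3)]

col_means = np.mean(X, axis=0)  # averages across the arrays: one mean per index
row_means = np.mean(X, axis=1)  # one mean per array, i.e. per sample of size n
grand_mean = np.mean(X)         # a single mean over all 3*n values

print(col_means.shape)      # (10,)
print(row_means.shape)      # (3,)
print(np.ndim(grand_mean))  # 0
```

With axis=0 the result always has n entries, no matter how many arrays are in the list, which matches the 10 values seen in the question.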

I am not completely sure what exactly you are trying to do, but the following code computes a lower and an upper confidence bound for mu 10,000 times, each from a fresh sample of n values, and stores every bound in a list.

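import numpy as np

N = 10000   # number of simulated samples
sigma = 1   # known standard deviation
mu = 10     # true mean
n = 10      # size of each sample

lower_CIs = []
upper_CIs = []
for i in range(N):
    # Draw a fresh sample of n values and take its mean (a single number).
    X = mu + sigma * np.random.normal(size=n)
    Xbar = np.mean(X)
    # 95% interval for the mean with known sigma: half-width 1.96*sigma/sqrt(n).
    half_width = 1.96 * sigma / np.sqrt(n)
    lower_CIs.append(Xbar - half_width)
    upper_CIs.append(Xbar + half_width)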
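The question also asks for the fraction of intervals that include mu. A minimal vectorized sketch of that step follows, assuming the known-sigma interval Xbar ± 1.96*sigma/sqrt(n) from the question. The seed and the variable names (samples, xbars, coverage) are illustrative choices, not part of the original post.

```python
import numpy as np

N, n = 10_000, 10
mu, sigma = 10, 1

rng = np.random.default_rng(42)  # arbitrary seed, assumed for reproducibility

# Draw all N samples at once: one row per sample of size n.
samples = mu + sigma * rng.normal(size=(N, n))
xbars = samples.mean(axis=1)  # N sample means, one per row

half_width = 1.96 * sigma / np.sqrt(n)  # half-width of the known-sigma 95% interval
lower = xbars - half_width
upper = xbars + half_width

# Fraction of the N intervals that contain the true mean mu.
coverage = np.mean((lower <= mu) & (mu <= upper))
print(coverage)
```

For a 95% interval the printed fraction should land close to 0.95. The same comparison also works on the lists built in a loop once they are converted with np.array.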
