
Calculating Mean Squared Error with Sample Mean

I was given this assignment, and I'm not sure if I understand the question correctly.

We considered the sample-mean estimator for the distribution mean. Another estimator for the distribution mean is the min-max-mean estimator, which takes the mean (average) of the smallest and largest observed values. For example, for the sample {1, 2, 1, 5, 1}, the sample mean is (1+2+1+5+1)/5 = 2, while the min-max-mean is (1+5)/2 = 3. In this problem we ask you to run a simulation that approximates the mean squared error (MSE) of the two estimators for a uniform distribution.

Take a continuous uniform distribution between a and b, given as parameters. Draw a 10-observation sample from this distribution, and calculate the sample-mean and the min-max-mean. Repeat the experiment 100,000 times, and for each estimator calculate its average bias as your MSE estimates.

Sample Input: Sample_Mean_MSE(1, 5)
Sample Output: 0.1343368663225577
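To make sure I read the example right, here is a quick NumPy snippet (mine, just for illustration) computing both estimators for the sample above:

import numpy as np

sample = np.array([1, 2, 1, 5, 1])
sample_mean = sample.mean()                        # (1+2+1+5+1)/5 = 2.0
min_max_mean = (sample.min() + sample.max()) / 2   # (1+5)/2 = 3.0
print(sample_mean, min_max_mean)                   # 2.0 3.0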

The code below is my attempt to:

  1. Draw a sample of size 10 from a uniform distribution between a and b
  2. Calculate the MSE, with the mean computed using the sample-mean method
  3. Repeat 100,000 times, and store the resulting MSEs in an array
  4. Return the mean of the MSE array as the final result

However, the result I get is quite far from the sample output above. Can someone clarify the assignment, specifically the part "Repeat the experiment 100,000 times, and for each estimator calculate its average bias as your MSE estimates"? Thanks.

import numpy as np

def Sample_Mean_MSE(a, b):
    # inputs: bounds a and b of the uniform distribution
    # sample size is 10
    # number of experiments is 100,000
    # output: MSE for the sample-mean estimator with sample size 10
    mse_s = np.array([])
    for k in range(100000):
        sample = np.random.randint(low=a, high=b, size=10)
        squared_errors = np.array([])
        for value in sample:
            error = value - sample.mean()
            squared_errors = np.append(squared_errors, error ** 2)
        mse_s = np.append(mse_s, squared_errors.mean())

    return mse_s.mean()


print(Sample_Mean_MSE(1, 5))

To get the expected result, we first need to understand what the mean squared error (MSE) of an estimator is. Take the sample-mean estimator as an example (the min-max-mean estimator works basically the same way):

MSE assesses the average squared difference between the observed and predicted values - in this case, between the distribution mean (approximated from all 100,000 drawn samples) and the sample mean (from each individual sample). So we can break it down as below:

  1. Draw a sample of size 10 and calculate the sample mean (ŷ)
  2. Repeat n = 100,000 times and calculate the mean of all drawn samples (y)
  3. Calculate the MSE:

MSE = 1/n * Σ(y - ŷ)^2
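Putting the steps together, a minimal sketch of the corrected simulation could look like this (assumptions: y is taken as the grand mean of all draws as described above, though the true mean (a + b) / 2 of a continuous uniform distribution could be used directly; Min_Max_Mean_MSE is a name I made up for symmetry). Note also that your code draws integers with np.random.randint, whereas the assignment asks for a continuous uniform distribution, so np.random.uniform is used here:

import numpy as np

def Sample_Mean_MSE(a, b, n_experiments=100000, sample_size=10):
    # one row per experiment, drawn from a continuous uniform distribution
    samples = np.random.uniform(low=a, high=b, size=(n_experiments, sample_size))
    sample_means = samples.mean(axis=1)       # ŷ for each experiment
    y = samples.mean()                        # distribution mean estimated from all draws
    return np.mean((y - sample_means) ** 2)   # MSE = 1/n * Σ(y - ŷ)^2

def Min_Max_Mean_MSE(a, b, n_experiments=100000, sample_size=10):
    samples = np.random.uniform(low=a, high=b, size=(n_experiments, sample_size))
    min_max_means = (samples.min(axis=1) + samples.max(axis=1)) / 2
    y = samples.mean()
    return np.mean((y - min_max_means) ** 2)

print(Sample_Mean_MSE(1, 5))    # ≈ 0.133, close to the sample output above
print(Min_Max_Mean_MSE(1, 5))   # noticeably smaller for a uniform distribution

Drawing everything into one (n_experiments, sample_size) array avoids the Python-level loop and the repeated np.append calls, which is both much faster and closer to idiomatic NumPy.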
