
Calculating Mean Squared Error with Sample Mean

I was given this assignment, and I'm not sure if I understand the question correctly.

We considered the sample-mean estimator for the distribution mean. Another estimator for the distribution mean is the min-max-mean estimator that takes the mean (average) of the smallest and largest observed values. For example, for the sample {1, 2, 1, 5, 1}, the sample mean is (1+2+1+5+1)/5=2 while the min-max-mean is (1+5)/2=3. In this problem we ask you to run a simulation that approximates the mean squared error (MSE) of the two estimators for a uniform distribution.

Take a continuous uniform distribution between a and b - given as parameters. Draw a 10-observation sample from this distribution, and calculate the sample-mean and the min-max-mean. Repeat the experiment 100,000 times, and for each estimator calculate its average bias as your MSE estimates.

Sample Input: Sample_Mean_MSE(1, 5) Sample Output: 0.1343368663225577
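
Just to check that I understand the two estimators, here is a small sketch (separate from my actual attempt below) that computes both of them for the example sample:

import numpy as np

sample = np.array([1, 2, 1, 5, 1])
sample_mean = sample.mean()                       # (1+2+1+5+1)/5 = 2.0
min_max_mean = (sample.min() + sample.max()) / 2  # (1+5)/2 = 3.0
print(sample_mean, min_max_mean)                  # 2.0 3.0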

This code below is me trying to:

  1. Draw a sample of size 10 from a uniform distribution between a and b
  2. Calculate the MSE, with the mean calculated using the sample-mean method
  3. Repeat 100,000 times, and store the resulting MSEs in an array
  4. Return the mean of the MSEs array as the final result

However, the result I get is quite far from the sample output above. Can someone clarify the part of the assignment that says "Repeat the experiment 100,000 times, and for each estimator calculate its average bias as your MSE estimates"? Thanks

import numpy as np

def Sample_Mean_MSE(a, b):
    # inputs: bounds for uniform distribution a and b
    # sample size is 10
    # number of experiments is 100,000
    # output: MSE for sample mean estimator with sample size 10
    mse_s = np.array([])
    k = 0
    while k in range(100000):
        sample = np.random.randint(low=a, high=b, size=10)
        squared_errors = np.array([])
        for i, value in enumerate(sample):
            error = value - sample.mean()
            squared_errors = np.append(squared_errors, error ** 2)
        k += 1
        mse_s = np.append(mse_s, squared_errors.mean())

    return mse_s.mean()


print(Sample_Mean_MSE(1, 5))

To get the expected result, we first need to understand what the mean squared error (MSE) of an estimator is. Take the sample-mean estimator as an example (the min-max-mean estimator works the same way):

MSE measures the average squared difference between the estimated and true values - in this case, between the sample mean (computed from each sample) and the distribution mean (which you can approximate by averaging over all 100,000 drawn samples, or, for a uniform distribution, compute directly as (a+b)/2). So we can break it down as below:

  1. Draw a sample of size 10 and calculate its sample mean (ŷ)
  2. Repeat this n = 100,000 times and calculate the overall mean of all drawn observations (y), which approximates the distribution mean
  3. Calculate the MSE, summing over the n sample means:

MSE = 1/n * Σ(y - ŷ)^2
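
Note also that the assignment asks for a continuous uniform distribution, so the samples should be drawn with np.random.uniform rather than np.random.randint. Since the mean of Uniform(a, b) is simply (a+b)/2, you can also use it directly as y instead of estimating it from the drawn samples. A minimal sketch along these lines could look like the following (Min_Max_Mean_MSE and the variable names are only illustrative):

import numpy as np

def Sample_Mean_MSE(a, b, n=100000, sample_size=10):
    true_mean = (a + b) / 2                       # distribution mean of Uniform(a, b)
    # one row per experiment, drawn from a continuous uniform distribution
    samples = np.random.uniform(low=a, high=b, size=(n, sample_size))
    sample_means = samples.mean(axis=1)           # sample-mean estimate per experiment
    return np.mean((sample_means - true_mean) ** 2)

def Min_Max_Mean_MSE(a, b, n=100000, sample_size=10):
    true_mean = (a + b) / 2
    samples = np.random.uniform(low=a, high=b, size=(n, sample_size))
    # min-max-mean estimate: average of the smallest and largest observation
    min_max_means = (samples.min(axis=1) + samples.max(axis=1)) / 2
    return np.mean((min_max_means - true_mean) ** 2)

print(Sample_Mean_MSE(1, 5))
print(Min_Max_Mean_MSE(1, 5))

As a sanity check, the sample mean is unbiased, so its MSE equals Var(X)/10 = (b-a)^2/12/10, which is about 0.1333 for a = 1 and b = 5 - close to the expected output 0.1343368663225577.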
