Variance calculated with two methods returns different results in Python

Question

I wanted to calculate the variance for a distribution of discrete values using two different methods, to prove they return identical results:

1. σ**2 = <j**2> - <j>**2
2. σ**2 = <(Δj)**2> = Σ (Δj)**2 * P(j)

Here's my code:

j = [14,15,16,22,24,25]
Nj = [1,1,3,2,2,5]
N = sum(Nj)

Pj = [Nj[i]/N for i in range(len(j))]

j_mean = sum(Pj[i]*j[i] for i in range(len(j)))
j_sqmean = sum(Pj[i]*j[i]**2 for i in range(len(j)))

var1 = j_mean**2 - j_sqmean
var2 = sum((j[i]-j_mean)*Nj[i] for i in range(len(j)))

print(var1,var2)

For some reason the two results differ (var1 != var2), and I can't figure out where my code goes wrong.

Answer 1

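Both formulas are implemented incorrectly. In var1 the subtraction is reversed: it should be <j**2> - <j>**2. In var2 the deviation is not squared, and it is weighted by the count Nj[i] instead of the probability Pj[i]. Change them to: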

var1 = j_sqmean - j_mean**2
var2 = sum((j[i]-j_mean)**2 * Pj[i] for i in range(len(j)))

print(var1,var2)
# 18.571428571428555 18.57142857142857
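
The two printed values agree only up to floating-point rounding, so a strict == comparison between them is not a reliable test. As a rough sketch of how the identity could be checked, the snippet below compares the float results with math.isclose and then repeats the calculation with fractions.Fraction, where both formulas give exactly the same rational number. The helper name variances is made up for this illustration and is not part of the original code.

```python
from fractions import Fraction
import math

j = [14, 15, 16, 22, 24, 25]
Nj = [1, 1, 3, 2, 2, 5]
N = sum(Nj)


def variances(values, probs):
    """Return the variance computed with both formulas (hypothetical helper)."""
    mean = sum(p * v for v, p in zip(values, probs))
    sq_mean = sum(p * v**2 for v, p in zip(values, probs))
    var1 = sq_mean - mean**2                                      # <j**2> - <j>**2
    var2 = sum(p * (v - mean)**2 for v, p in zip(values, probs))  # <(Δj)**2>
    return var1, var2


# Float probabilities: compare with a tolerance instead of ==
float_var1, float_var2 = variances(j, [n / N for n in Nj])
print(math.isclose(float_var1, float_var2))  # True

# Exact rational probabilities: both formulas give the same fraction
exact_var1, exact_var2 = variances(j, [Fraction(n, N) for n in Nj])
print(exact_var1, exact_var2)  # 130/7 130/7
```

Using zip here avoids indexing with range(len(j)), but that is only a style choice; the arithmetic is the same as in the corrected lines above.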
