
Variance calculated with two methods returns different results in Python

I wanted to calculate the variance for a distribution of discrete values using two different methods, to prove they return identical results:

1. σ**2 = <j**2> - <j>**2
2. σ**2 = <(Δj)**2> = Σ (Δj)**2 * P(j), where Δj = j - <j>

Here's my code:

j = [14,15,16,22,24,25]
Nj = [1,1,3,2,2,5]
N = sum(Nj)

Pj = [Nj[i]/N for i in range(len(j))]

j_mean = sum(Pj[i]*j[i] for i in range(len(j)))
j_sqmean = sum(Pj[i]*j[i]**2 for i in range(len(j)))

var1 = j_mean**2 - j_sqmean
var2 = sum((j[i]-j_mean)*Nj[i] for i in range(len(j)))

print(var1,var2)

For some reason the result is var1 != var2, and I can't figure out where my code goes wrong.

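Both formulas are implemented incorrectly. In var1 the two terms are subtracted in the wrong order, and in var2 the deviation is not squared and is weighted by the count Nj[i] instead of the probability Pj[i]. Change them to: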

var1 = j_sqmean - j_mean**2
var2 = sum((j[i]-j_mean)**2 * Pj[i] for i in range(len(j)))

print(var1,var2)
# 18.571428571428555 18.57142857142857
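The two printed values differ only in the last few digits, which is ordinary floating-point rounding rather than a difference between the formulas. As a rough sketch of how the comparison could be made robust, the snippet below wraps both calculations in a helper and compares them with math.isclose instead of ==. The function name variances and the names var_moments and var_deviations are made up for illustration and are not part of the original code.

```python
import math

def variances(values, counts):
    """Return the variance of a discrete distribution, computed two ways."""
    total = sum(counts)
    probs = [n / total for n in counts]  # P(j) = N(j) / N

    mean = sum(p * v for p, v in zip(probs, values))        # <j>
    sq_mean = sum(p * v**2 for p, v in zip(probs, values))  # <j**2>

    # Method 1: <j**2> - <j>**2
    var_moments = sq_mean - mean**2
    # Method 2: sum of squared deviations weighted by P(j)
    var_deviations = sum(p * (v - mean)**2 for p, v in zip(probs, values))
    return var_moments, var_deviations

j = [14, 15, 16, 22, 24, 25]
Nj = [1, 1, 3, 2, 2, 5]

var1, var2 = variances(j, Nj)
# Compare with a tolerance, since the two results may differ by rounding error.
print(math.isclose(var1, var2))  # True
```

Iterating with zip rather than range(len(j)) is only a stylistic choice; the arithmetic is the same as in the corrected lines above.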
