How to calculate the standard deviation in Python when x and P(x) are known

Question

I have the following data and want to calculate the standard deviation. The values of x and the probability of each value are given.

X	P(x)
-2000	0.1
-1000	0.1
0	0.2
1000	0.2
2000	0.3
3000	0.1

I know how to calculate it by hand: compute Var(x) as E[x^2] - (E[x])^2, then take Sqrt(Var(x)).

[This was done by hand]
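
For reference, with the values in the table above, the calculation works out as follows:

```latex
\begin{align*}
E[x] &= (-2000)(0.1) + (-1000)(0.1) + (0)(0.2) + (1000)(0.2) + (2000)(0.3) + (3000)(0.1) = 800 \\
E[x^2] &= (4 \times 10^6)(0.1) + (10^6)(0.1) + (0)(0.2) + (10^6)(0.2) + (4 \times 10^6)(0.3) + (9 \times 10^6)(0.1) = 2{,}800{,}000 \\
\mathrm{Var}(x) &= E[x^2] - (E[x])^2 = 2{,}800{,}000 - 800^2 = 2{,}160{,}000 \\
\sqrt{\mathrm{Var}(x)} &\approx 1469.69
\end{align*}
```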

How do you calculate it in Python?

Answer 1

To clarify: if all 6 values are assumed to be equally likely, the standard deviation of [1000, 2000, 3000, 0, -1000, -2000] is indeed 1707.8.
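
That equal-probability figure can be reproduced with the standard library. A minimal sketch, assuming the population standard deviation (statistics.pstdev, which divides by N) is the one meant here; the variable names are only illustrative:

```python
import statistics

values = [1000, 2000, 3000, 0, -1000, -2000]

# pstdev divides by N, i.e. it treats each value as having probability 1/N
equal_weight_std = statistics.pstdev(values)
print(round(equal_weight_std, 1))  # 1707.8
```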

In the post, however, the 6 values have unequal probabilities [0.1, 0.1, 0.2, 0.2, 0.3, 0.1], so each value has to be weighted by its probability:

import pandas as pd

df = pd.DataFrame([
    {'x': -2000, 'P(x)': 0.1},
    {'x': -1000, 'P(x)': 0.1},
    {'x': 0, 'P(x)': 0.2},
    {'x': 1000, 'P(x)': 0.2},
    {'x': 2000, 'P(x)': 0.3},
    {'x': 3000, 'P(x)': 0.1},
])

# Per-row terms; the column sums are E(x) and E(x^2)
df['E(x)'] = df['x'] * df['P(x)']        # x * P(x)
df['E(x^2)'] = df['x']**2 * df['P(x)']   # x^2 * P(x)

# Var(x) = E(x^2) - (E(x))^2
variance = df['E(x^2)'].sum() - df['E(x)'].sum()**2
std_dev = variance**0.5
print(df)  # in a Jupyter notebook, display(df) also works
print('Standard Deviation is: {:.2f}'.format(std_dev))

Output

    x       P(x)    E(x)    E(x^2)
0   -2000   0.1     -200.0  400000.0
1   -1000   0.1     -100.0  100000.0
2   0       0.2     0.0     0.0
3   1000    0.2     200.0   200000.0
4   2000    0.3     600.0   1200000.0
5   3000    0.1     300.0   900000.0
Standard Deviation is: 1469.69

To confirm, you can go to https://www.rapidtables.com/calc/math/standard-deviation-calculator.html
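
The same result can also be obtained without adding helper columns. A minimal sketch, assuming NumPy is available; it uses np.average with its weights argument and the equivalent definition Var(x) = E[(x - E[x])^2] instead of the E[x^2] - (E[x])^2 form used above:

```python
import numpy as np

x = np.array([-2000, -1000, 0, 1000, 2000, 3000])
p = np.array([0.1, 0.1, 0.2, 0.2, 0.3, 0.1])

mean = np.average(x, weights=p)                    # E[x]
variance = np.average((x - mean) ** 2, weights=p)  # E[(x - E[x])^2]
std_dev = np.sqrt(variance)
print('Standard Deviation is: {:.2f}'.format(std_dev))
```

Note that np.average divides by the sum of the weights, so it would silently normalize probabilities that do not add up to 1.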

Answer 2

Try this:

import math

# Assumes df is a DataFrame with columns 'x' and 'P(x)', as built in Answer 1
df['x_squared'] = df['x']**2
df['E_of_x_squared'] = df['x_squared'] * df['P(x)']
df['E_of_x'] = df['x'] * df['P(x)']

# E(x^2) and (E(x))^2
sum_E_x_square = df['E_of_x_squared'].values.sum()
square_of_E_x_sum = df['E_of_x'].values.sum()**2

# Var(x) = E(x^2) - (E(x))^2
var = sum_E_x_square - square_of_E_x_sum

std_dev = math.sqrt(var)

print('Standard Deviation is: ' + str(std_dev))
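
If pandas is not needed, the same steps fit in a small function. A minimal sketch using only the standard library; the name discrete_std and the check that the probabilities sum to 1 are illustrative additions, not part of either answer:

```python
import math

def discrete_std(xs, ps):
    """Standard deviation of a discrete distribution with values xs and probabilities ps."""
    if len(xs) != len(ps):
        raise ValueError('xs and ps must have the same length')
    if not math.isclose(math.fsum(ps), 1.0):
        raise ValueError('probabilities must sum to 1')
    e_x = math.fsum(x * p for x, p in zip(xs, ps))       # E(x)
    e_x2 = math.fsum(x**2 * p for x, p in zip(xs, ps))   # E(x^2)
    # Clamp at 0 so rounding error cannot produce a tiny negative variance
    return math.sqrt(max(e_x2 - e_x**2, 0.0))

xs = [-2000, -1000, 0, 1000, 2000, 3000]
ps = [0.1, 0.1, 0.2, 0.2, 0.3, 0.1]
print('Standard Deviation is: {:.2f}'.format(discrete_std(xs, ps)))
```

Rejecting probabilities that do not sum to 1 is a design choice for this sketch: it surfaces data-entry mistakes instead of returning a misleading number.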

2 solutions

Solution 1
2 Accepted

Solution 2
0 2021-05-15 09:29:12
