Calculate the 3rd standard deviation for an array
Say I have an array:
import numpy as np
x = np.array([0, 1, 2, 5, 6, 7, 8, 8, 8, 10, 29, 32, 45])
How can I calculate the 3rd standard deviation for it, so I can get the value of +3sigma as shown in the picture below?
Typically, I use std = np.std(x), but to be honest, I don't know whether it returns the 1sigma value, the 2sigma value, or something else. I'd be very grateful for your help. Thank you in advance.
NumPy's std yields the standard deviation, which is usually denoted by "sigma", so np.std(x) is the 1-sigma value. To get the 2-sigma or 3-sigma ranges, you can simply multiply sigma by 2 or 3 and add it to (or subtract it from) the mean:
print([x.mean() - 3 * x.std(), x.mean() + 3 * x.std()])
Output:
[-27.545797458510656, 52.315028227741429]
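As a small sketch of the same idea, the snippet below builds the 1-, 2- and 3-sigma intervals around the mean in one loop. The helper name `sigma_interval` is hypothetical and not part of NumPy. Note that `np.std` uses `ddof=0` (the population standard deviation) by default; passing `ddof=1` gives the sample estimate instead, and which one is appropriate depends on your data.

```python
import numpy as np

x = np.array([0, 1, 2, 5, 6, 7, 8, 8, 8, 10, 29, 32, 45])

def sigma_interval(values, k, ddof=0):
    """Return (mean - k*sigma, mean + k*sigma) for the given data.

    Hypothetical helper for illustration; ddof=0 matches np.std's default.
    """
    mean = np.mean(values)
    sigma = np.std(values, ddof=ddof)
    return mean - k * sigma, mean + k * sigma

# Print the interval for each multiple of sigma.
for k in (1, 2, 3):
    low, high = sigma_interval(x, k)
    print(f"{k}-sigma: [{low:.3f}, {high:.3f}]")
```

The upper end of the `k = 3` interval is the +3sigma value asked about in the question.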
For more detailed information, you might refer to the documentation, which states:
The standard deviation is the square root of the average of the squared deviations from the mean, i.e., std = sqrt(mean(abs(x - x.mean())**2)).
http://docs.scipy.org/doc/numpy/reference/generated/numpy.std.html
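To see that the quoted formula is what `np.std` computes with its default settings, it can be written out step by step and compared against the library call. This is only a quick check on the array from the question; the variable names `deviations` and `manual_std` are made up for illustration.

```python
import numpy as np

x = np.array([0, 1, 2, 5, 6, 7, 8, 8, 8, 10, 29, 32, 45])

# Formula quoted from the documentation, written out step by step.
deviations = np.abs(x - x.mean())
manual_std = np.sqrt(np.mean(deviations ** 2))

print(manual_std)
# Compare with NumPy's built-in result (default ddof=0).
print(np.isclose(manual_std, np.std(x)))
```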
For anyone who stumbles on this, you can use something like the following, which adds a column telling you which standard-deviation band each row falls into, based on another column's values (the band thresholds are rounded to whole numbers):
# get percentile and which standard deviation for daily decline pct change
def which_std_dev(row, df, col):
    mean = df[col].mean()
    std = df[col].std()
    value = row[col]
    # Band 1: everything up to and including mean + 1 * std.
    if value <= round(mean + 1 * std, 0):
        return 1
    # Bands 2..9: band k holds values below mean + k * std.
    for k in range(2, 10):
        if value < round(mean + k * std, 0):
            return k
    # Anything at or beyond mean + 9 * std.
    return 10

# df_day is an existing DataFrame with a 'daily_decline_pct_change' column.
df_day['percentile'] = df_day['daily_decline_pct_change'].rank(pct=True).round(3)
df_day['which_std_dev'] = df_day.apply(lambda row: which_std_dev(row, df_day, 'daily_decline_pct_change'), axis=1)
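The row-wise `apply` recomputes the column's mean and standard deviation for every row. A vectorised variant is sketched below under a few assumptions: the function name `std_dev_band` is hypothetical, the thresholds are not rounded (unlike the code in this answer), a value exactly on a threshold goes to the lower band, and the data is not constant (otherwise the standard deviation is zero). It uses `ddof=1` to mirror the default of pandas' `std`, which differs from NumPy's default of `ddof=0`.

```python
import numpy as np

def std_dev_band(values, max_band=10):
    """Assign each value the smallest k with value <= mean + k * std,
    clipped to the range 1..max_band. Hypothetical helper for illustration."""
    values = np.asarray(values, dtype=float)
    mean = values.mean()
    std = values.std(ddof=1)  # ddof=1 mirrors the pandas default
    z = (values - mean) / std  # distance from the mean in units of sigma
    bands = np.ceil(z)         # smallest integer k with z <= k
    return np.clip(bands, 1, max_band).astype(int)

sample = np.array([1.0, 2.0, 3.0, 4.0, 50.0])
print(std_dev_band(sample))
```

With a DataFrame, the result could be assigned directly, for example `df_day['which_std_dev'] = std_dev_band(df_day['daily_decline_pct_change'])`, assuming the column contains no missing values.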