Different Standard Deviation in Pandas and Numpy
I was trying to calculate the standard deviation (std) of an array. I tried both numpy and pandas, but the result does not make sense to me: I get two different std values for the same array!
Why does this happen?
>>> import numpy as np
>>> import pandas as pd
>>> a = np.arange(10)+1
>>> a
array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])
>>> a.std()
2.8722813232690143
>>> b = pd.DataFrame(a)
>>> b.std()
0    3.02765
dtype: float64
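The difference is in the delta degrees of freedom (ddof). The divisor used is N - ddof: numpy defaults to ddof=0 (population standard deviation, dividing by N), while pandas defaults to ddof=1 (sample standard deviation, dividing by N - 1):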
>>> print(a.std())
2.8722813232690143
>>> print(a.std(ddof=0))
2.8722813232690143
>>> print(a.std(ddof=1))
3.0276503540974917
>>> b = pd.DataFrame(a)
>>> print(b.std())
0    3.02765
dtype: float64
>>> print(b.std(ddof=1))
0    3.02765
dtype: float64
>>> print(b.std(ddof=0))
0    2.872281
dtype: float64
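To see where the two numbers come from, here is a minimal sketch that computes both values by hand from the same sum of squared deviations. The variable names (ss, population_std, sample_std) are only illustrative and not part of either library:

```python
import numpy as np
import pandas as pd

a = np.arange(10) + 1
n = a.size

# Sum of squared deviations from the mean, shared by both estimators.
ss = ((a - a.mean()) ** 2).sum()

# Illustrative names: the only difference is the divisor, N - ddof.
population_std = np.sqrt(ss / n)        # divide by N     (ddof=0)
sample_std = np.sqrt(ss / (n - 1))      # divide by N - 1 (ddof=1)

# The manual values agree with each library's default.
print(np.isclose(population_std, a.std()))          # numpy default
print(np.isclose(sample_std, pd.Series(a).std()))   # pandas default

# Passing ddof explicitly makes the two libraries agree with each other.
print(np.isclose(a.std(ddof=1), pd.Series(a).std()))
print(np.isclose(a.std(), pd.Series(a).std(ddof=0)))
```

Each check should print True. If you need the same result from both libraries, passing ddof explicitly on both sides is probably the safest option, since it does not depend on either default.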