Standard deviation and mean of complete pandas dataframe
I have quite a large dataset and would like to calculate the mean and the standard deviation across all columns and rows. Unfortunately, I haven't found a proper solution for this yet. My dataset looks a little bit like this (600 rows in total):
When I use the pandas function weekl_rtr.mean(), I only get the mean of each column. A workaround that might work for the mean is weekl_rtr.mean().mean(), but this does not work for the standard deviation. Do you have an idea how to solve this?
Thank you and kind regards,
Markus
To my knowledge, there is no direct way to do it in pandas. You have two options: df.values.mean(), or df.to_numpy().mean() in pandas 0.24+.
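As a minimal sketch of the NumPy route, assuming a small made-up frame standing in for weekl_rtr (the column names and values below are illustrative, not the asker's data). One thing to watch: NumPy's std() defaults to ddof=0 (population), while pandas' std() defaults to ddof=1 (sample), so pass ddof=1 if you want the pandas convention.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for weekl_rtr; names and values are made up.
weekl_rtr = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0]})

# One 2-D array holding every cell (pandas 0.24+; use .values on older versions).
values = weekl_rtr.to_numpy()

overall_mean = values.mean()              # mean over every cell
overall_std_pop = values.std()            # NumPy default: ddof=0 (population)
overall_std_sample = values.std(ddof=1)   # same convention as pandas' std()

# For comparison: the std of the per-column stds is a different quantity.
std_of_stds = weekl_rtr.std().std()
```

In this toy frame both columns have the same spread, so std_of_stds comes out as zero even though the values as a whole clearly vary, which is why chaining std() does not work the way chaining mean() might. If the data contains NaN, values.mean() returns NaN; np.nanmean and np.nanstd skip missing values instead.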
Change the axis for mean and standard deviation:
# One value per column, aggregating down the rows (default)
weekl_rtr.mean(axis=0)
# or
weekl_rtr.mean()
# One value per row, aggregating across the columns
weekl_rtr.mean(axis=1)
The same applies to std().
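A small sketch of std() with both axis values, again assuming a made-up frame in place of weekl_rtr:

```python
import pandas as pd

# Hypothetical stand-in for weekl_rtr; names and values are made up.
weekl_rtr = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [3.0, 4.0, 5.0]})

# axis=0 (default): one sample std per column, indexed by column name.
std_per_column = weekl_rtr.std(axis=0)

# axis=1: one sample std per row, indexed by the row labels.
std_per_row = weekl_rtr.std(axis=1)
```

Either call still returns a Series rather than a single number, so on its own it does not give the standard deviation of the whole frame.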
You should also look into df.describe(), which summarizes a DataFrame with more statistics (count, mean, std, min, max, percentiles):
import pandas as pd

# One summary per column
weekl_rtr.describe()
# One summary per row
weekl_rtr.apply(pd.DataFrame.describe, axis=1)
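describe() can also be pointed at the whole frame at once, which is closer to what the question asks for. The sketch below is one possible approach, not part of the answer above: it flattens every cell into a single Series first. The frame is a made-up stand-in for weekl_rtr.

```python
import pandas as pd

# Hypothetical stand-in for weekl_rtr; names and values are made up.
weekl_rtr = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0]})

# Flatten every cell into one long Series, then summarize it.
all_values = pd.Series(weekl_rtr.to_numpy().ravel())
summary = all_values.describe()

overall_mean = summary["mean"]
overall_std = summary["std"]   # sample std (ddof=1), the pandas default
```

Because the summary is computed by pandas, the std here uses ddof=1, unlike a bare NumPy std() call.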