
Standard deviation and mean of complete pandas dataframe

I've got quite a large dataset and would like to calculate the mean and the standard deviation across all columns and rows. Unfortunately, I haven't found a proper solution for this yet. My dataset looks a little bit like this (600 rows in total):

[Image: output of df.head() for the dataset]

When I use the pandas function weekl_rtr.mean(), I only get the mean of each column. A workaround that might work for the mean is weekl_rtr.mean().mean(), but this does not work for the standard deviation. Do you have an idea how to solve this?

Thank you and kind regards,

Markus

To my knowledge, there is no direct way to do it in pandas. You have two options:

  1. Get the underlying numpy array and calculate mean or std on it. In contrast to pandas, this evaluates the function across all dimensions by default. For example, you can do df.values.mean(), or df.to_numpy().mean() in pandas 0.24+.
  2. Transform the table into a single column and then run the desired operation on that column.
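
The two options above can be sketched as follows. This is only an illustration: the small frame is a made-up stand-in for weekl_rtr, and stack() is one possible way to get a single column (the answer does not name a specific method). One detail worth keeping in mind is that numpy's std() defaults to ddof=0 (population), while pandas' std() defaults to ddof=1 (sample), so the two options agree only if ddof is set explicitly.

```python
import pandas as pd

# Hypothetical stand-in for the asker's data (the real frame has 600 rows).
weekl_rtr = pd.DataFrame({
    "a": [1.0, 2.0, 3.0],
    "b": [4.0, 5.0, 6.0],
})

# Option 1: work on the underlying NumPy array, which aggregates all cells.
values = weekl_rtr.to_numpy()
mean_all = values.mean()
std_population = values.std()      # NumPy default: ddof=0
std_sample = values.std(ddof=1)    # same convention as pandas' default

# Option 2: stack the table into a single column (a Series), then aggregate.
stacked = weekl_rtr.stack()
mean_stacked = stacked.mean()
std_stacked = stacked.std()        # pandas default: ddof=1

print(mean_all, std_sample, std_stacked)
```
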

Change the axis for mean and standard deviation:

# One mean per column, aggregating down the rows (default)
weekl_rtr.mean(axis = 0)
# or
weekl_rtr.mean()

# One mean per row, aggregating across the columns
weekl_rtr.mean(axis = 1)

The same applies to std(). You should also look into df.describe(), which summarizes a DataFrame with more statistics (count, mean, std, min, max, percentiles):

import pandas as pd

# One summary per column
weekl_rtr.describe()

# One summary per row (each row is passed to describe as a Series)
weekl_rtr.apply(pd.Series.describe, axis=1)
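
Note that changing the axis still gives one result per column or per row, not a single figure for the whole frame. A minimal sketch of the difference, using a made-up frame in place of weekl_rtr and assuming stack() as the way to collapse everything into one column before describing it:

```python
import pandas as pd

# Hypothetical stand-in for the asker's data.
weekl_rtr = pd.DataFrame({
    "a": [1.0, 2.0, 3.0],
    "b": [4.0, 5.0, 6.0],
})

per_column = weekl_rtr.std()            # one value per column
per_row = weekl_rtr.std(axis=1)         # one value per row

# Collapse all cells into one Series, then summarize them together.
overall = weekl_rtr.stack().describe()  # count, mean, std, min, quartiles, max

print(per_column.shape, per_row.shape)
print(overall)
```
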
