Calculate the average of the rows for each group

I need to calculate the mean of a certain column in a DataFrame so that, within each group, the value for each row is the mean of that row and all previous rows of the same group (a running mean per group). Suppose we have the DataFrame below, where the Expected column shows the desired output.

Is there a way to do this other than iterating over the rows by index, adding the previous rows at every iteration, and then calculating the mean? I wonder if there is a more efficient way of doing it.

unit    A      Expected 
T10     8      8
T10     7      7.5
T10     12     9
T11     10     10
T11     6      8
T12     17     17
T12     7      12
T12     3      9
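To make the question reproducible, here is a minimal sketch that builds the sample data and computes Expected with the row-by-row approach described above. The variable names (`expected`, `seen`, `same_unit`) are illustrative only, and the loop is meant as a slow baseline to compare faster answers against, not as a recommended solution.

```python
import pandas as pd

# Sample data copied from the table in the question
# (Expected is omitted because it is what we want to compute).
df = pd.DataFrame({
    "unit": ["T10", "T10", "T10", "T11", "T11", "T12", "T12", "T12"],
    "A": [8, 7, 12, 10, 6, 17, 7, 3],
})

# Row-by-row baseline: for each row, average A over the rows of the
# same unit up to and including the current row.
expected = []
for i in range(len(df)):
    seen = df.iloc[: i + 1]
    same_unit = seen[seen["unit"] == df["unit"].iloc[i]]
    expected.append(float(same_unit["A"].mean()))

df["Expected"] = expected
print(df)
```

This does quadratic work in the number of rows, which is why a vectorized groupby approach is preferable on larger data.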

You can use expanding:

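# The result is indexed by (unit, original row label); drop the unit level
# so the assignment aligns on the original index even if rows are not sorted by unit.
df2 = df.groupby('unit')['A'].expanding().mean().reset_index(level=0, drop=True)
df['Expected'] = df2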
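One detail worth checking is how the expanding result lines up with the original rows. The grouped result comes back ordered by group, so a positional assignment only matches when the rows are already sorted by unit. The sketch below uses a small made-up frame with interleaved units (not from the question) and aligns on the original index by dropping the group level.

```python
import pandas as pd

# Hypothetical data with interleaved units, to show index alignment.
df = pd.DataFrame({
    "unit": ["T11", "T10", "T10", "T11", "T10"],
    "A": [10, 8, 7, 6, 12],
})

# The grouped expanding mean has a MultiIndex of (unit, original row label).
running = df.groupby("unit")["A"].expanding().mean()

# Dropping the unit level leaves the original row labels, so pandas
# aligns each value with the row it was computed for.
df["Expected"] = running.reset_index(level=0, drop=True)
print(df)  # Expected: 10.0, 8.0, 7.5, 8.0, 9.0
```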

Divide the per-group cumulative sum (GroupBy.cumsum) by a running counter built from GroupBy.cumcount:

g = df.groupby('unit')['A']
df['Expected'] = g.cumsum().div(g.cumcount() + 1)
print(df)
  unit   A  Expected
0  T10   8       8.0
1  T10   7       7.5
2  T10  12       9.0
3  T11  10      10.0
4  T11   6       8.0
5  T12  17      17.0
6  T12   7      12.0
7  T12   3       9.0
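As a sanity check, the cumulative-sum approach can be compared with an expanding mean on the same data. This is a minimal, self-contained sketch; `by_cumsum`, `by_expanding` and `max_diff` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({
    "unit": ["T10", "T10", "T10", "T11", "T11", "T12", "T12", "T12"],
    "A": [8, 7, 12, 10, 6, 17, 7, 3],
})

g = df.groupby("unit")["A"]

# Running sum divided by the 1-based position of each row within its group.
by_cumsum = g.cumsum().div(g.cumcount() + 1)

# Expanding mean, re-indexed by the original row labels.
by_expanding = g.expanding().mean().reset_index(level=0, drop=True)

# Subtraction aligns on the index, so row order does not matter here.
max_diff = (by_cumsum - by_expanding).abs().max()
print(max_diff)
```

Both approaches keep the original index, so either result can be assigned straight back to the DataFrame.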

To calculate the mean of a particular column in pandas, all you need to do is call the mean method on that column.

mean = df["frequencies"].mean()

where df is the DataFrame and frequencies is the column whose mean you want to find.
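For comparison, here is a small sketch of what the plain mean method returns on the question's data (using column A, since the question has no frequencies column). It yields a single number for the whole column, and a grouped mean yields one number per unit, so neither gives a per-row running mean by itself.

```python
import pandas as pd

df = pd.DataFrame({
    "unit": ["T10", "T10", "T10", "T11", "T11", "T12", "T12", "T12"],
    "A": [8, 7, 12, 10, 6, 17, 7, 3],
})

# One scalar for the entire column.
overall = df["A"].mean()
print(overall)  # 8.75

# One value per unit (the mean of all rows in each group).
per_unit = df.groupby("unit")["A"].mean()
print(per_unit)
```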
