Calculate the average of the rows for each group
I need to calculate the mean of a certain column in a DataFrame so that, within each group, the value for each row is the mean of that row and all previous rows of the same group.

Is there a way to iterate over the rows by index, adding the previous rows in every iteration, and then calculate the mean? I wonder if there's a more efficient way of doing it.

Let's assume we have this DataFrame, where the Expected column is the desired output:
unit   A  Expected
T10    8       8
T10    7       7.5
T10   12       9
T11   10      10
T11    6       8
T12   17      17
T12    7      12
T12    3       9
Divide DataFrameGroupBy.cumsum by a counter from GroupBy.cumcount, adding 1 because cumcount starts at 0:
# Running sum of A within each unit, divided by the 1-based row position in the group
g = df.groupby('unit')['A']
df['Expected'] = g.cumsum().div(g.cumcount() + 1)
print(df)
  unit   A  Expected
0  T10   8       8.0
1  T10   7       7.5
2  T10  12       9.0
3  T11  10      10.0
4  T11   6       8.0
5  T12  17      17.0
6  T12   7      12.0
7  T12   3       9.0
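For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt from the sample data in the question. The Expanding column is an assumption on my part rather than part of the answer above: it uses the group-wise expanding mean as a cross-check, which should give the same numbers.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "unit": ["T10", "T10", "T10", "T11", "T11", "T12", "T12", "T12"],
    "A": [8, 7, 12, 10, 6, 17, 7, 3],
})

g = df.groupby("unit")["A"]

# Approach from the answer: running sum divided by the 1-based position in the group
df["Expected"] = g.cumsum().div(g.cumcount() + 1)

# Cross-check (not from the original answer): expanding mean per group.
# The result has a (unit, original index) MultiIndex, so the group level
# is dropped to align it with df again.
df["Expanding"] = g.expanding().mean().reset_index(level=0, drop=True)

print(df)
```

Both columns are computed without a Python-level loop over rows, which is what the question was trying to avoid.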
To calculate the mean of a particular column in pandas, all you need to do is call the mean method on that column:
mean = df["frequencies"].mean()
where df is the name of the DataFrame and frequencies is the column whose mean you want.
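Note that mean on a column returns one number for the whole column, not a value per row. As a rough sketch on the question's sample data (using column A instead of the hypothetical frequencies column), this is how the whole-column mean compares with one mean per unit:

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "unit": ["T10", "T10", "T10", "T11", "T11", "T12", "T12", "T12"],
    "A": [8, 7, 12, 10, 6, 17, 7, 3],
})

# A single scalar for the entire column
overall_mean = df["A"].mean()

# One mean per unit (a Series indexed by unit)
group_means = df.groupby("unit")["A"].mean()

print(overall_mean)
print(group_means)
```

Neither of these gives the running mean per row that the question asks for; they are shown only for comparison.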