Python Pandas group by mean() for a certain count of rows
I need to take the group-by mean() of only the first 2 values of each category. How do I define that? My df looks like this:
category value
-> a 2
-> a 5
a 4
a 8
-> b 6
-> b 3
b 1
-> c 2
-> c 2
c 7
Reading only the arrowed rows, the output should look like this:
category mean
a 3.5
b 4.5
c 2
How can I do this? I tried the line below, but I do not know where to specify that only the first 2 observations from each category should be used:
output = df.groupby(['category'])['value'].mean().reset_index()
Your help is appreciated, thanks in advance.
Try apply on each group of values, using head(2) to keep just the first 2 values before taking the mean:
import pandas as pd
df = pd.DataFrame({
    'category': {0: 'a', 1: 'a', 2: 'a', 3: 'a', 4: 'b', 5: 'b',
                 6: 'b', 7: 'c', 8: 'c', 9: 'c'},
    'value': {0: 2, 1: 5, 2: 4, 3: 8, 4: 6, 5: 3, 6: 1, 7: 2,
              8: 2, 9: 7}
})
output = df.groupby('category', as_index=False)['value'] \
    .apply(lambda a: a.head(2).mean())
print(output)
Output:
category value
0 a 3.5
1 b 4.5
2 c 2.0
Or create a boolean index to filter df with before grouping:
m = df.groupby('category').cumcount().lt(2)
output = df[m].groupby('category')['value'].mean().reset_index()
print(output)
category value
0 a 3.5
1 b 4.5
2 c 2.0
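To see why this mask picks out exactly the arrowed rows, it may help to look at the intermediate values. The sketch below rebuilds the same sample frame (using plain lists rather than index dicts) and prints the cumcount() result next to the mask. The names pos and keep are only illustrative and are not part of the answer above.

```python
import pandas as pd

# Same sample data as in the question
df = pd.DataFrame({
    'category': ['a', 'a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c'],
    'value': [2, 5, 4, 8, 6, 3, 1, 2, 2, 7],
})

# cumcount() numbers the rows inside each group, starting from 0
pos = df.groupby('category').cumcount()

# Rows numbered 0 and 1 are the first two rows of their category
keep = pos.lt(2)

# Show the position and the mask alongside the original columns
print(df.assign(pos=pos, keep=keep))
```

Rows where keep is True are the ones that survive df[m], so the later groupby only ever sees the first two values of each category.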
You can also do this via groupby() and agg():
out = df.groupby('category', as_index=False)['value'].agg(lambda x: x.head(2).mean())
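Another variant, not taken from the answers above, is to call head() on the groupby itself to keep the first n rows of every group, and then aggregate as usual without a lambda. Below is a minimal sketch, assuming "first" means row order in the frame. The helper name mean_of_first_n and its parameters are made up for illustration.

```python
import pandas as pd


def mean_of_first_n(frame, key, col, n=2):
    """Hypothetical helper: mean of `col` over the first `n` rows of each `key` group."""
    # GroupBy.head(n) returns the first n rows of every group, in original row order
    first_rows = frame.groupby(key).head(n)
    # A plain groupby mean on the reduced frame then gives one row per group
    return first_rows.groupby(key, as_index=False)[col].mean()


# Same sample data as in the question
df = pd.DataFrame({
    'category': ['a', 'a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c'],
    'value': [2, 5, 4, 8, 6, 3, 1, 2, 2, 7],
})

out = mean_of_first_n(df, 'category', 'value', n=2)
print(out)
```

If "first" should follow some other order, for example by date, the frame would need to be sorted before the helper is called.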