
Python Pandas group by mean() for a certain count of rows

I need to take the mean() of only the first 2 values of each category. How do I define that? My df looks like this:

category    value
-> a    2
-> a    5
a   4
a   8
-> b    6
-> b    3
b   1
-> c    2
-> c    2
c   7

Reading only the rows marked with an arrow, the output should look like this:

category    mean
a   3.5
b   4.5
c   2

How can I do this? This is what I have so far, but I do not know where to specify that only the first 2 observations from each category should be used:

output = df.groupby(['category'])['value'].mean().reset_index()

Your help is appreciated, thanks in advance.

Use apply on each group of values, with head(2) to take just the first 2 values, and then mean:

import pandas as pd

df = pd.DataFrame({
    'category': {0: 'a', 1: 'a', 2: 'a', 3: 'a', 4: 'b', 5: 'b',
                 6: 'b', 7: 'c', 8: 'c', 9: 'c'},
    'value': {0: 2, 1: 5, 2: 4, 3: 8, 4: 6, 5: 3, 6: 1, 7: 2,
              8: 2, 9: 7}
})

output = df.groupby('category', as_index=False)['value'] \
    .apply(lambda a: a.head(2).mean())

print(output)

Output:

  category  value
0        a    3.5
1        b    4.5
2        c    2.0

Or create a boolean index to filter df with, then group the filtered rows:

m = df.groupby('category').cumcount().lt(2)
output = df[m].groupby('category')['value'].mean().reset_index()
print(output)

Output:

  category  value
0        a    3.5
1        b    4.5
2        c    2.0
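
To make the boolean index above less opaque, here is a minimal sketch that builds it step by step. The intermediate names position, mask and first_two are introduced only for illustration; the DataFrame is the same sample data, written with lists instead of dicts.

```python
import pandas as pd

df = pd.DataFrame({
    'category': ['a', 'a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c'],
    'value': [2, 5, 4, 8, 6, 3, 1, 2, 2, 7],
})

# cumcount() numbers the rows inside each group, starting at 0
position = df.groupby('category').cumcount()

# lt(2) is True for positions 0 and 1, i.e. the first two rows per group
mask = position.lt(2)

# Rows kept by the mask: exactly the arrowed rows from the question
first_two = df[mask]
print(first_two)
```

Because the filter runs before the second groupby, the mean is computed with the built-in aggregation rather than a Python lambda, which may matter on larger frames.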

You can also do this via groupby() and agg():

out = df.groupby('category', as_index=False)['value'].agg(lambda x: x.head(2).mean())
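
If the number of rows needs to vary, the same idea can be wrapped in a small function. The sketch below is one possible variant, not taken from the answers above: it uses GroupBy.head(n) to select the first n rows of every group and then takes an ordinary grouped mean. The helper name mean_of_first_n and its parameters are hypothetical, and the column names are assumed to match the question.

```python
import pandas as pd


def mean_of_first_n(frame, n=2):
    """Mean of 'value' over the first n rows of each 'category'.

    Hypothetical helper for illustration; not part of pandas.
    """
    # GroupBy.head(n) returns the first n rows of every group,
    # keeping the original row order and index
    first_rows = frame.groupby('category').head(n)
    return first_rows.groupby('category', as_index=False)['value'].mean()


df = pd.DataFrame({
    'category': ['a', 'a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c'],
    'value': [2, 5, 4, 8, 6, 3, 1, 2, 2, 7],
})

result = mean_of_first_n(df, n=2)
print(result)
```

With n=2 this should give the same three means as the approaches above (3.5, 4.5 and 2.0).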
