Pandas groupby aggregation
Let's say we have a pandas DataFrame like the one below.
> category level score
> Bus travel 0.75
> Bus travel 0.60
> Bus vehicles 0.50
What I want is to group by 'level' and calculate the count and the maximum score for each 'level'. The 'hard' part is to produce output like this:
> category travel score vehicles score
> Bus 2 0.75 1 0.5
I have been trying this:
> grouped = df.groupby('level').agg({
>     'category': 'count',
>     'score': 'max'
> })
Any ideas?
from io import StringIO
import pandas as pd

text = """category level score
Bus travel 0.75
Bus travel 0.60
Bus vehicles 0.50"""
# Parse the whitespace-separated sample into a DataFrame.
df = pd.read_csv(StringIO(text), sep=r"\s+")
print(df)
category level score
0 Bus travel 0.75
1 Bus travel 0.60
2 Bus vehicles 0.50
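# Count and max of 'score' per (category, level), then pivot 'level' into the columns.
gdf = df.groupby(['category', 'level'])['score'].agg(['count', 'max']).unstack()
# Move 'level' to the top column level and keep each level's statistics together.
gdf.columns = gdf.columns.swaplevel(0, 1)
gdf = gdf.sort_index(axis=1)
print(gdf)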
level travel vehicles
count max count max
category
Bus 2 0.75 1 0.5
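The question asks for a flat, single-row layout rather than two-level column headers. As a rough sketch of one way to get there, `pivot_table` can compute both statistics in one call, and the resulting column pairs can then be joined into plain names. The `travel_count` style names and the variable names `wide` and `flat` are illustrative choices, not part of the original post.

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame({
    "category": ["Bus", "Bus", "Bus"],
    "level": ["travel", "travel", "vehicles"],
    "score": [0.75, 0.60, 0.50],
})

# One row per category, one column per (statistic, level) pair.
wide = df.pivot_table(
    index="category",
    columns="level",
    values="score",
    aggfunc=["count", "max"],
)

# Reorder the column levels to (level, statistic) and sort so that
# each level's count and max sit next to each other.
wide = wide.swaplevel(0, 1, axis=1).sort_index(axis=1)

# Flatten the two-level columns into single names (illustrative naming).
wide.columns = [f"{level}_{stat}" for level, stat in wide.columns]

# Turn 'category' back into an ordinary column.
flat = wide.reset_index()
print(flat)
```

If different labels are preferred, such as repeating `score` as in the question's sketch, only the list comprehension that builds the column names needs to change.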