pandas: how to sort a groupby by group size while aggregating on another column
I have the following df:
id amount
1 20
2 8
1 3
1 2
2 7
I want to group the df by id and sort the groups by their sizes:
df.groupby('id').size().sort_values(ascending=False)
but I also want to aggregate the amount of each group into a separate column, total, at the same time:
id amount total size
1 20 25 3
1 3 25 3
1 2 25 3
2 8 15 2
2 7 15 2
You can use GroupBy + agg with a list of aggregations, followed by pd.merge:
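import pandas as pd

# Per-id group size and sum of amount, indexed by id
g = df.groupby('id')['amount'].agg(['size', 'sum'])
# Attach the aggregates to every row, then order rows by group size
res = pd.merge(df, g, left_on='id', right_index=True)\
        .sort_values('size', ascending=False)
print(res)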
id amount size sum
0 1 20 3 25
2 1 3 3 25
3 1 2 3 25
1 2 8 2 15
4 2 7 2 15
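
The result above names the aggregate column sum, while the question asked for total. As an alternative to the merge, here is a minimal sketch that broadcasts the per-group values back onto the rows with transform. It assumes the df from the question, rebuilt here so the snippet runs on its own. The column names total and size come from the question; the choice of kind="mergesort" is an assumption, made so that rows inside each group keep their original relative order.

```python
import pandas as pd

# Rebuild the example frame from the question
df = pd.DataFrame({"id": [1, 2, 1, 1, 2], "amount": [20, 8, 3, 2, 7]})

grouped = df.groupby("id")["amount"]

res = (
    df.assign(
        total=grouped.transform("sum"),   # per-group sum, repeated on each row
        size=grouped.transform("size"),   # per-group row count, repeated on each row
    )
    # mergesort is a stable sort, so ties keep their original row order
    .sort_values("size", ascending=False, kind="mergesort")
)
print(res)
```

This skips the intermediate aggregated frame, which may be convenient when the only goal is to add columns to the original rows. The merge approach is probably the better fit if the aggregated table g is needed on its own as well.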