[英]Groupby multiple columns count size and calculate mean of another column in Pandas
I have a dataframe as follows:我有一个数据框如下:
index col1 col2 col3 col4 col5
0 0 Week_1 James John 1 when and why?
1 1 Week_1 James John 3 when and why? How?
2 2 Week_2 James John 2 How far is it? Are you going?
3 3 Week_2 Mark Jim 3 Do you know when?
4 4 Week_2 Andrew Simon 3 What time?
5 5 Week_2 Andrew Simon 6 What time?
How could I groupby col2
and col3
then calculate mean and count numbers of col2
and col3
?我怎么能GROUPBY
col2
和col3
然后计算平均和计数的数字col2
和col3
?
df.groupby(['col2','col3'], as_index=False).agg({'col4':'mean'}).reset_index()
Output:输出:
index col2 col3 col4
0 0 Andrew Simon 4.5
1 1 James John 2.0
2 2 Mark Jim 3.0
df.groupby(['col2','col3']).size().reset_index()
Output:输出:
col2 col3 0
0 Andrew Simon 2
1 James John 3
2 Mark Jim 1
How could I get result like this?我怎么能得到这样的结果? Thanks.
谢谢。
index col2 col3 mean count
0 0 James John 2.0 3
1 3 Mark Jim 3.0 1
2 4 Andrew Simon 4.5 2
You can use groupby and agg(may need to use pandas 0.25+).您可以使用 groupby 和 agg(可能需要使用 pandas 0.25+)。
(
df.groupby(['col2','col3'])
.agg(index=('index', 'first'),
mean=('col4', 'mean'),
count=('col4', 'size'))
.reset_index()
.sort_values(by='index')
)
col2 col3 index mean count
1 James John 0 2.0 3
2 Mark Jim 3 3.0 1
0 Andrew Simon 4 4.5 2
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.