How can I count the number of rows per group in Pandas?

I have a dataset with several Oscar winners. I have the following columns: Name of winner, award, place of birth, date of birth and year. I want to check how many rows are filled per year. Let's say for 2005 we have the winner of best director and best actor and for 2006 we have the winner for best supporting actor. I want to get something like this as the result:

year_of_award  number of rows
2005           2
2006           1

It looks so simple, but I can't get it right. Most posts I found recommend combining groupby with count(). However, when I run the code below, I get the row count repeated for every column: the year, plus the other 4 columns each filled with the number of rows.

df.groupby(['year_of_award']).count() 

How can I get just the year and the number of rows?

For pandas 0.25+, try named aggregation:

df.groupby(['year_of_award']).agg(number_of_rows=('award', 'count'))

For older versions, aggregate a single column and rename it:

df.groupby(['year_of_award']).agg({'award': 'count'}).rename(columns={'award': 'number_of_rows'})
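
To see the result end to end, here is a minimal sketch on made-up data. Only the column names year_of_award and award come from the question; the name column and all values are assumptions for illustration. It also shows size() as a possible alternative, which counts rows per group directly instead of counting the non-null values of one column.

```python
import pandas as pd

# Hypothetical sample data; names and values are invented for illustration.
df = pd.DataFrame({
    "name": ["Winner A", "Winner B", "Winner C"],
    "award": ["Best Director", "Best Actor", "Best Supporting Actor"],
    "year_of_award": [2005, 2005, 2006],
})

# Named aggregation (pandas 0.25+): count non-null 'award' values per year.
counts = (
    df.groupby("year_of_award")
      .agg(number_of_rows=("award", "count"))
      .reset_index()
)

# Alternative: size() counts every row in each group, including rows
# where 'award' is missing.
sizes = df.groupby("year_of_award").size().reset_index(name="number_of_rows")

print(counts)
print(sizes)
```

Note that count() skips missing values, so the two results can differ if the award column contains NaN; size() is the safer choice when you want the number of rows regardless of which cells are filled.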
