Sort by both index and value in Multi-indexed data of Pandas dataframe

Question

Suppose I have a dataframe as below:

    year    month   message
0   2018    2   txt1
1   2017    4   txt2
2   2019    5   txt3
3   2017    5   txt5
4   2017    5   txt4
5   2020    4   txt3
6   2020    6   txt3
7   2020    6   txt3
8   2020    6   txt4

I want to find the top three message counts in each year. So I grouped the data as below:

df.groupby(['year','month']).count()

which results in:

            message
year    month   
2017    4   1
        5   2
2018    2   1
2019    5   1
2020    4   1
        6   3
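
For anyone who wants to reproduce this, the sample frame and the grouped counts can be rebuilt roughly as follows. This is a minimal sketch; the variable name `counts` is introduced here for illustration only and is not part of the original post.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "year": [2018, 2017, 2019, 2017, 2017, 2020, 2020, 2020, 2020],
    "month": [2, 4, 5, 5, 5, 4, 6, 6, 6],
    "message": ["txt1", "txt2", "txt3", "txt5", "txt4",
                "txt3", "txt3", "txt3", "txt4"],
})

# Count messages per (year, month); the result has a two-level index
# and both levels come back sorted in ascending order.
counts = df.groupby(["year", "month"]).count()
print(counts)
```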

Both index levels are in ascending order. How can I get the result shown below, where the data is sorted by year (ascending) and then by count (descending), keeping only the top n values per year? The 'month' level can be in any order.

            message
year    month   
2017    5   2
        4   1
2018    2   1
2019    5   1
2020    6   3
        4   1

Answer 1

This sorts by year (ascending) and count (descending).

df = df.groupby(['year', 'month']).count().sort_values(['year', 'message'], ascending=[True, False])
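
A self-contained sketch of the same idea, checked against the sample data from the question. It assumes a pandas version in which `sort_values` accepts index level names in `by` (0.23 or later, as far as I know); the names `counts` and `ranked` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "year": [2018, 2017, 2019, 2017, 2017, 2020, 2020, 2020, 2020],
    "month": [2, 4, 5, 5, 5, 4, 6, 6, 6],
    "message": ["txt1", "txt2", "txt3", "txt5", "txt4",
                "txt3", "txt3", "txt3", "txt4"],
})

counts = df.groupby(["year", "month"]).count()

# 'year' is an index level and 'message' is a column; sort_values can mix
# both. Year ascending, then count descending within each year.
ranked = counts.sort_values(["year", "message"], ascending=[True, False])
print(ranked)
```

Keeping the grouped result in a separate variable, rather than reassigning `df`, leaves the original rows available for further checks.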

Answer 2

value_counts sorts by count in descending order within each group by default:

df.groupby('year')['month'].value_counts()

Output:

year  month
2017  5        2
      4        1
2018  2        1
2019  5        1
2020  6        3
      4        1
Name: month, dtype: int64

If you want only 2 top values for each year, do another groupby:

(df.groupby('year')['month'].value_counts()
   .groupby('year').head(2)
)

Output:

year  month
2017  5        2
      4        1
2018  2        1
2019  5        1
2020  6        3
      4        1
Name: month, dtype: int64
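
Since the question asks for the top three per year, the two steps above can be wrapped in a small helper with n as a parameter. This is only a sketch; `top_months_per_year` is a hypothetical name, not a pandas function.

```python
import pandas as pd


def top_months_per_year(frame, n=3):
    """Return the n most frequent months for each year (hypothetical helper)."""
    # value_counts orders each year's months by count, highest first.
    per_year = frame.groupby("year")["month"].value_counts()
    # head(n) keeps the first n rows of every year without reordering them.
    return per_year.groupby(level="year").head(n)


df = pd.DataFrame({
    "year": [2018, 2017, 2019, 2017, 2017, 2020, 2020, 2020, 2020],
    "month": [2, 4, 5, 5, 5, 4, 6, 6, 6],
    "message": ["txt1", "txt2", "txt3", "txt5", "txt4",
                "txt3", "txt3", "txt3", "txt4"],
})

top1 = top_months_per_year(df, n=1)
top3 = top_months_per_year(df, n=3)
print(top1)
```

With this sample no year has more than two distinct months, so n=2 and n=3 return all six rows; n=1 is the first value that actually trims the result.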

Answer 3

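You can use sort_index, specifying ascending=[True,False] so that only the second level is sorted in descending order. Note that this orders by month, not by count; it matches the desired output here only because, in each year of the sample, the later month also has the higher count: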

df = df.groupby(['year','month']).count().sort_index(ascending=[True,False])

              message
year month         
2017 5            2
     4            1
2018 2            1
2019 5            1
2020 6            3
     4            1
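
To see where sorting the index and sorting the counts diverge, here is a small sketch with made-up data (the year 2021 rows are not from the question) in which the earlier month has the higher count.

```python
import pandas as pd

# Hypothetical data: month 3 has three messages, month 9 has one.
demo = pd.DataFrame({
    "year": [2021, 2021, 2021, 2021],
    "month": [3, 3, 3, 9],
    "message": ["a", "b", "c", "d"],
})

counts = demo.groupby(["year", "month"]).count()

# Sorting the index puts month 9 first, regardless of its count.
by_index = counts.sort_index(ascending=[True, False])

# Sorting on the count column puts month 3 (count 3) first.
by_count = counts.sort_values(["year", "message"], ascending=[True, False])

print(by_index)
print(by_count)
```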

Answer 4

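Here you go. The second sort uses a stable algorithm (kind='mergesort') so that the count order from the first sort is preserved within each year:

df.groupby(['year', 'month']).count().sort_values(axis=0, ascending=False, by='message').sort_values(axis=0, ascending=True, by='year', kind='mergesort')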
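
Chaining two sorts only keeps the earlier ordering if the later sort is stable, and the default `kind` for a single-key `sort_values` is quicksort, which makes no such guarantee. A minimal sketch of the two-pass version on the question's sample data, with illustrative variable names:

```python
import pandas as pd

df = pd.DataFrame({
    "year": [2018, 2017, 2019, 2017, 2017, 2020, 2020, 2020, 2020],
    "month": [2, 4, 5, 5, 5, 4, 6, 6, 6],
    "message": ["txt1", "txt2", "txt3", "txt5", "txt4",
                "txt3", "txt3", "txt3", "txt4"],
})

counts = df.groupby(["year", "month"]).count()

# Pass 1: highest counts first, ignoring the year.
by_count = counts.sort_values(by="message", ascending=False)

# Pass 2: group rows back by year; mergesort is stable, so rows with the
# same year keep the count order established in pass 1.
two_pass = by_count.sort_values(by="year", ascending=True, kind="mergesort")
print(two_pass)
```

A single `sort_values` call with both keys gives the same ordering in one pass, so the two-pass form is mainly useful when the sorts happen at different points in a pipeline.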

Answer 5

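You can use this code. It sorts the index in descending order and then re-sorts by year, so within each year the rows are ordered by month (descending) rather than by count; the second sort is made stable (kind='mergesort') so that this order is kept:

df.groupby(['year', 'month']).count().sort_index(axis=0, ascending=False).sort_values(by="year", ascending=True, kind='mergesort')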

5 answers

solution1
2 2020-03-09 12:41:46

solution2
2 ACCEPTED 2020-03-09 12:55:53

solution3
1 2020-03-09 11:59:31

solution4
1 2020-03-09 12:40:23

solution5
0 2020-03-09 12:11:14
