
Pandas: sort dataframe on the basis of number of rows for the column value

I have a dataframe like this:

a    b
1    2
3    2
2    3
6    3
7    3
5    4

I want to sort this dataframe by the number of rows that share each value of b, with the most frequent value first. Expected output:

a    b
2    3
6    3
7    3
1    2
3    2
5    4

Is there a one-liner for this?

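You can build a temporary column of value counts (actually a single-column DataFrame, since sorting a Series can cause stability problems), sort it in descending order, and index the original DataFrame with the resulting index. In current pandas the method is sort_values (the old sort method has been removed), and kind='mergesort' requests a stable sort so that rows with equal counts keep their original order:

print(df.loc[df[['b']].replace(df.b.value_counts().to_dict()).sort_values('b', ascending=False, kind='mergesort').index])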

Output:

   a  b
2  2  3
3  6  3
4  7  3
0  1  2
1  3  2
5  5  4
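
The same idea can be spelled out step by step, which may be easier to read than the one-liner. This is only a sketch: it swaps replace for Series.map, and the names counts, order and result are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 3, 2, 6, 7, 5], 'b': [2, 2, 3, 3, 3, 4]})

# For every row, look up how many rows share its b value.
counts = df['b'].map(df['b'].value_counts())

# Sort the counts in descending order; mergesort is stable, so rows
# with equal counts keep their original relative order.
order = counts.sort_values(ascending=False, kind='mergesort').index

# Reorder the original rows by that index.
result = df.loc[order]
print(result)
```

The intermediate counts Series is never added to df, so the original frame is left unchanged.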

You can use groupby:

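import pandas as pd
df = pd.DataFrame({'a':[1,3,2,6,7,5], 'b':[2,2,3,3,3,4]})
# Size of each row's b group, sorted descending with a stable sort (.ix and .sort no longer exist in pandas)
df.loc[df.groupby('b')['b'].transform('size').sort_values(ascending=False, kind='mergesort').index]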

    a   b
2   2   3
3   6   3
4   7   3
0   1   2
1   3   2
5   5   4
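
If a true one-liner is wanted, the group sizes can also be passed to sort_values as a sort key. This is a sketch that assumes pandas 1.1 or newer, where sort_values accepts the key argument; the name result is only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 3, 2, 6, 7, 5], 'b': [2, 2, 3, 3, 3, 4]})

# key receives column b as a Series and must return a same-length Series;
# here each value is replaced by the size of its group before sorting.
result = df.sort_values(
    'b',
    key=lambda s: s.groupby(s).transform('size'),
    ascending=False,
    kind='mergesort',  # stable: ties keep their original order
)
print(result)
```

The key only affects the ordering; the values stored in column b are not modified.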
