Pandas: sort a dataframe by the number of rows for each column value
I have a dataframe like this:
a b
1 2
3 2
2 3
6 3
7 3
5 4
I want to sort this dataframe by the number of rows for each value of b, most frequent first. Desired output:
a b
2 3
6 3
7 3
1 2
3 2
5 4
Is there a possible one-liner for this?
You can sort a temporary column built from the value counts (actually a single-column DataFrame, since sorting a Series can cause stability problems), then index the original DataFrame with the resulting index:
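# DataFrame.sort was removed in pandas 0.20; sort_values with kind='mergesort' keeps tied rows in their original order
print(df.loc[df[['b']].replace(df.b.value_counts().to_dict()).sort_values('b', ascending=False, kind='mergesort').index])
Output: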
a b
2 2 3
3 6 3
4 7 3
0 1 2
1 3 2
5 5 4
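The same idea can be written out step by step, which may be easier to follow than the one-liner. This is only a sketch: the frame is rebuilt here from the question's data, `counts` and `order` are illustrative names, and `map` is swapped in for `replace` so that each value of b is looked up in the counts directly.

```python
import pandas as pd

# Data reconstructed from the question
df = pd.DataFrame({'a': [1, 3, 2, 6, 7, 5], 'b': [2, 2, 3, 3, 3, 4]})

# For every row, look up how many times its value of b occurs
counts = df['b'].map(df['b'].value_counts())

# Stable descending sort on the counts; ties keep their original row order
order = counts.sort_values(ascending=False, kind='mergesort').index

# Reorder the original frame by the sorted index
result = df.loc[order]
print(result)
```

Because the sort is stable, rows that share a count stay in their original relative order.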
You can use groupby:
import pandas as pd
df = pd.DataFrame({'a':[1,3,2,6,7,5], 'b':[2,2,3,3,3,4]})
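# .ix and DataFrame.sort no longer exist in current pandas; use .loc and sort_values instead
# transform('size') gives each row the size of its b group; mergesort keeps ties in original order
df.loc[df.groupby('b')['b'].transform('size').sort_values(ascending=False, kind='mergesort').index]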
a b
2 2 3
3 6 3
4 7 3
0 1 2
1 3 2
5 5 4
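If your pandas is 1.1 or later (an assumption about your environment), `sort_values` accepts a `key` callable, so the count-based ordering might be expressed without building the index by hand. A minimal sketch, with `by_count` as an illustrative name:

```python
import pandas as pd

# Data reconstructed from the question
df = pd.DataFrame({'a': [1, 3, 2, 6, 7, 5], 'b': [2, 2, 3, 3, 3, 4]})

# key receives column b as a Series and returns the sort keys:
# here, the number of rows sharing each value of b
by_count = df.sort_values(
    'b',
    key=lambda s: s.map(s.value_counts()),
    ascending=False,
    kind='mergesort',  # stable, so ties keep their original order
)
print(by_count)
```

Note that if two different values of b have the same count, their rows are ordered by original position and can interleave; sorting on a helper count column plus b would keep each group together.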