Pandas MultiIndex DataFrame: get the top 5 rows of each sorted group
Assuming you have the following DataFrame:
In [97]: df
Out[97]:
               Time
waller poster
1      11         2
       22         3
       33         1
       44         1
       55         1
2      33         1
3      11         1
       22         1
       33         1
       44         2
       55         1
       66         3
Solution:
In [98]: (df.sort_index(ascending=[True, False])
    ...:    .groupby(level=0, as_index=False)
    ...:    .apply(lambda x: x.head(5) if len(x) >= 5 else x.head(0))
    ...:    .reset_index(level=0, drop=True)
    ...: )
Out[98]:
               Time
waller poster
1      55         1
       44         1
       33         1
       22         3
       11         2
3      66         3
       55         1
       44         2
       33         1
       22         1
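
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds the example frame and writes the same selection with groupby(...).filter and groupby(...).head instead of apply. This is an alternative formulation rather than the code above, and the names ordered and top5 are only illustrative.

```python
import pandas as pd

# Rebuild the example: a (waller, poster) MultiIndex with a single Time column.
rows = [
    (1, 11, 2), (1, 22, 3), (1, 33, 1), (1, 44, 1), (1, 55, 1),
    (2, 33, 1),
    (3, 11, 1), (3, 22, 1), (3, 33, 1), (3, 44, 2), (3, 55, 1), (3, 66, 3),
]
df = (pd.DataFrame(rows, columns=["waller", "poster", "Time"])
        .set_index(["waller", "poster"]))

# Sort waller ascending and poster descending.
ordered = df.sort_index(level=["waller", "poster"], ascending=[True, False])

# Drop waller groups with fewer than 5 rows, then keep the first 5 of each.
top5 = (ordered.groupby(level="waller")
               .filter(lambda g: len(g) >= 5)
               .groupby(level="waller")
               .head(5))

print(top5)
```

Because filter and head both keep the original index, no reset_index step is needed in this variant.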
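To sort by the poster index level you can use sort_index with the level argument (older pandas versions offered sortlevel, which has since been deprecated and removed):
df.sort_index(level=1, ascending=False)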
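
As a small sketch of sorting on one index level, assuming a current pandas version where sort_index accepts a level name (the tiny frame and the name by_poster are made up for illustration):

```python
import pandas as pd

# Small illustrative frame with a (waller, poster) MultiIndex.
df = pd.DataFrame(
    {"Time": [2, 3, 1, 1]},
    index=pd.MultiIndex.from_tuples(
        [(1, 11), (1, 22), (2, 33), (3, 11)], names=["waller", "poster"]
    ),
)

# Sort rows by the poster level, largest poster first.
by_poster = df.sort_index(level="poster", ascending=False)

print(by_poster)
```

Note that this orders the whole frame by poster, so rows belonging to the same waller are no longer necessarily adjacent.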
To get the top n results you can use .head:
df.head(5)
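
Keep in mind that head on the frame itself returns the first n rows overall, not per group. A minimal sketch of the difference, using a made-up frame and n = 2:

```python
import pandas as pd

# Small illustrative frame with a (waller, poster) MultiIndex.
df = pd.DataFrame(
    {"Time": [2, 3, 1, 1, 1, 1]},
    index=pd.MultiIndex.from_tuples(
        [(1, 11), (1, 22), (1, 33), (2, 33), (3, 11), (3, 22)],
        names=["waller", "poster"],
    ),
)

# First 2 rows of the whole frame.
first_two_overall = df.head(2)

# First 2 rows within each waller group (groups with fewer rows are kept whole).
first_two_per_waller = df.groupby(level="waller").head(2)

print(first_two_overall)
print(first_two_per_waller)
```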
To drop records you can filter on the values of the respective index level:
df = df[df.index.get_level_values(1) > 5]
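
A row-aligned boolean mask is needed for this kind of filter. The sketch below uses get_level_values, which returns one label per row, whereas index.levels only holds the unique labels of each level. The threshold of 20 and the frame are made up for illustration.

```python
import pandas as pd

# Small illustrative frame with a (waller, poster) MultiIndex.
df = pd.DataFrame(
    {"Time": [2, 3, 1, 1]},
    index=pd.MultiIndex.from_tuples(
        [(1, 11), (1, 22), (2, 33), (3, 11)], names=["waller", "poster"]
    ),
)

# One poster label per row, so the mask has the same length as the frame.
mask = df.index.get_level_values("poster") > 20

# Keep only the rows whose poster label exceeds the threshold.
filtered = df[mask]

print(filtered)
```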
Let me know if this helps. It's hard to say whether this will answer your problem given the limited information.