DataFrame: Get the top n values of each type
I have a set of data like the one below:
ID Type value_1 value_2
1 A 12 89
2 A 13 78
3 A 11 92
4 A 9 79
5 B 15 83
6 B 34 91
7 B 2 87
8 B 3 86
9 B 7 85
10 C 9 83
11 C 3 85
12 C 2 87
13 C 12 88
14 C 11 82
I want to get the top 3 members of each Type according to value_1. The only solution that occurs to me is this: first, put the data for each Type into its own dataframe, sort it by value_1, and take the top 3; then merge the results together. But is there a simpler way to do this?
For ease of discussion, my code is below:
#coding:utf-8
import pandas as pd
_data = [
["1","A",12,89],
["2","A",13,78],
["3","A",11,92],
["4","A",9,79],
["5","B",15,83],
["6","B",34,91],
["7","B",2,87],
["8","B",3,86],
["9","B",7,85],
["10","C",9,83],
["11","C",3,85],
["12","C",2,87],
["13","C",12,88],
["14","C",11,82]
]
head= ["ID","type","value_1","value_2"]
df = pd.DataFrame(_data, columns=head)
You can use groupby and tail together with sort_values. Sorting by value_1 in ascending order puts the largest values last within each type, so tail(3) keeps the top 3:
newdf = df.sort_values(['type', 'value_1']).groupby('type').tail(3)
newdf
ID type value_1 value_2
2 3 A 11 92
0 1 A 12 89
1 2 A 13 78
8 9 B 7 85
4 5 B 15 83
5 6 B 34 91
9 10 C 9 83
13 14 C 11 82
12 13 C 12 88
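If the rows should run from largest to smallest within each type, one possible variation is to sort value_1 in descending order and take head(3) instead of tail(3). This is only a sketch built on the question's data; the name top3 is chosen here for illustration and does not appear in the original answer.

```python
import pandas as pd

# Same data as in the question.
_data = [
    ["1", "A", 12, 89], ["2", "A", 13, 78], ["3", "A", 11, 92], ["4", "A", 9, 79],
    ["5", "B", 15, 83], ["6", "B", 34, 91], ["7", "B", 2, 87], ["8", "B", 3, 86],
    ["9", "B", 7, 85], ["10", "C", 9, 83], ["11", "C", 3, 85], ["12", "C", 2, 87],
    ["13", "C", 12, 88], ["14", "C", 11, 82],
]
df = pd.DataFrame(_data, columns=["ID", "type", "value_1", "value_2"])

# Sort types ascending and value_1 descending, then keep the first 3 rows per type.
top3 = (
    df.sort_values(["type", "value_1"], ascending=[True, False])
      .groupby("type")
      .head(3)
)
print(top3)
```

The selected rows are the same as with tail(3); only their order within each type is reversed.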
Sure! DataFrame.groupby splits a dataframe into groups by the grouping fields, and the apply function applies a user-defined function (UDF) to each group.
df.groupby('type', as_index=False, group_keys=False)\
.apply(lambda x: x.sort_values('value_1', ascending=False).head(3))
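For comparison, a further option that avoids apply is to rank value_1 within each type and keep the rows whose rank is 3 or better. This is a sketch and not part of the answers above; the names rank_in_type and result are illustrative, and method="first" is an assumption about how ties should be broken (by order of appearance).

```python
import pandas as pd

# Same data as in the question.
_data = [
    ["1", "A", 12, 89], ["2", "A", 13, 78], ["3", "A", 11, 92], ["4", "A", 9, 79],
    ["5", "B", 15, 83], ["6", "B", 34, 91], ["7", "B", 2, 87], ["8", "B", 3, 86],
    ["9", "B", 7, 85], ["10", "C", 9, 83], ["11", "C", 3, 85], ["12", "C", 2, 87],
    ["13", "C", 12, 88], ["14", "C", 11, 82],
]
df = pd.DataFrame(_data, columns=["ID", "type", "value_1", "value_2"])

# Rank value_1 inside each type; rank 1 is the largest value.
rank_in_type = df.groupby("type")["value_1"].rank(method="first", ascending=False)

# Keep the three best-ranked rows of each type, in their original row order.
result = df[rank_in_type <= 3]
print(result)
```

Because this is a boolean mask on the original dataframe, the rows keep their original order and index, which may or may not be what you want.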