How to use nlargest with groupby and keep all columns?

Question

I want to group a DataFrame and, within each group, get the rows with the largest values in column 'C'. However, what I get back is a Series, not a DataFrame.

import pandas as pd

dftest = pd.DataFrame({'A':[1,2,3,4,5,6,7,8,9,10],
                       'B':['A','B','A','B','A','B','A','B','B','B'],
                       'C':[0,0,1,1,2,2,3,3,4,4]})
# Within each group of 'B', keep the largest 80% (by count) of the values in 'C'
dfn=dftest.groupby('B',group_keys=False)\
            .apply(lambda grp:grp['C'].nlargest(int(grp['C'].count()*0.8))).sort_index()

The result is a Series:

2    1
4    2
5    2
6    3
7    3
8    4
9    4
Name: C, dtype: int64

I would like the result to be a DataFrame, like this:

    A  B  C
2   3  A  1
4   5  A  2
5   6  B  2
6   7  A  3
7   8  B  3
8   9  B  4
9  10  B  4

Update: sorry, column 'A' is not actually a sequence of integers. dftest is more like this:

dftest = pd.DataFrame({'A':['Feb','Flow','Air','Flow','Feb','Beta','Cat','Feb','Beta','Air'],
                       'B':['A','B','A','B','A','B','A','B','B','B'],
                       'C':[0,0,1,1,2,2,3,3,4,4]})

and the result should be:

    A     B  C
2   Air   A  1
4   Feb   A  2
5   Beta  B  2
6   Cat   A  3
7   Feb   B  3
8   Beta  B  4
9   Air   B  4

Answer 1

It may be a bit clumsy, but it does what you asked:

dfn = dftest.groupby('B')\
            .apply(lambda grp: grp['C'].nlargest(int(grp['C'].count()*0.8)))\
            .reset_index()\
            .rename(columns={'level_1': 'A'})
# Rebuild 'A' from the original row label; this only works while A == index + 1
dfn.A = dfn.A + 1
dfn = dfn[['A','B','C']].sort_values(by='A')
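
The code above rebuilds 'A' from the row label, so it depends on 'A' being equal to the index plus one. A possible alternative, shown here only as a sketch, is to collect the row labels that nlargest selects in each group and use them to pick whole rows from the original frame. The name top_idx is made up for this example, and the data is the updated dftest from the question with strings in 'A'.

```python
import pandas as pd

dftest = pd.DataFrame({'A': ['Feb', 'Flow', 'Air', 'Flow', 'Feb', 'Beta', 'Cat', 'Feb', 'Beta', 'Air'],
                       'B': ['A', 'B', 'A', 'B', 'A', 'B', 'A', 'B', 'B', 'B'],
                       'C': [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]})

# top_idx is a hypothetical helper name: the row labels of the largest 80%
# (by count) of 'C' values within each group of 'B'.
top_idx = [label
           for _, c_values in dftest.groupby('B')['C']
           for label in c_values.nlargest(int(c_values.count() * 0.8)).index]

# Select the full rows by label, so every column is kept.
dfn = dftest.loc[top_idx].sort_index()
print(dfn)
```

Because the rows are looked up by label, this relies on the index of dftest being unique.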

Answer 2

Thanks to my friends, the following code works for me.

dfn=dftest.groupby('B',group_keys=False)\
            .apply(lambda grp:grp.nlargest(n=int(grp['C'].count()*0.8),columns='C').sort_index())

dfn is:

In [8]:dfn
Out[8]: 
    A  B  C
2   3  A  1
4   5  A  2
6   7  A  3
5   6  B  2
7   8  B  3
8   9  B  4
9  10  B  4

My previous code called nlargest on a Series (grp['C']), so only the values of 'C' came back. The later code calls nlargest on the DataFrame (grp) with columns='C', so whole rows are returned.
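
To make the difference concrete, here is a small sketch that applies both calls to a single group (B == 'A') of the original dftest. The variable names grp_a, as_series and as_frame are chosen for this example only.

```python
import pandas as pd

dftest = pd.DataFrame({'A': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                       'B': ['A', 'B', 'A', 'B', 'A', 'B', 'A', 'B', 'B', 'B'],
                       'C': [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]})

# One group, taken by hand for illustration.
grp_a = dftest[dftest['B'] == 'A']
n = int(grp_a['C'].count() * 0.8)  # 4 rows * 0.8 -> 3

# Series.nlargest: returns only the values of 'C' (with their row labels).
as_series = grp_a['C'].nlargest(n)

# DataFrame.nlargest: returns whole rows, ordered by 'C'.
as_frame = grp_a.nlargest(n, columns='C')

print(type(as_series).__name__, list(as_series.index))
print(type(as_frame).__name__, list(as_frame.columns))
```

Both calls pick the same row labels; only the DataFrame version carries the other columns along.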
