简体   繁体   中英

Pandas Keep top N values each group and set others to 0

I have the following df, which is adapted from here

import pandas as pd    
df = pd.DataFrame({'group':[1,1,1,2,2,2,3,3,3],
               'value':[1,2,3,1,2,3,4,3,2]})

I would like to have the following result:

df1 = pd.DataFrame({'group':[1,1,1,2,2,2,3,3,3],
               'value':[0,2,3,0,2,3,4,3,0]})

The logic is (1) to select top 2 numbers for each group and (2) to set other values to 0.

Any suggestions? Thanks.

Use nlargest

df.loc[~df.index.isin(df.groupby('group')['value']\
                        .apply(lambda s: s.nlargest(2))\
                        .index\
                        .get_level_values(1)), 'value'] = 0

   group  value
0      1      0
1      1      2
2      1      3
3      2      0
4      2      2
5      2      3
6      3      4
7      3      3
8      3      0

Using nlargest per group for boolean indexing:

m = df.groupby('group')['value'].apply(lambda g: g.isin(g.nlargest(2)))

df['value'] = df['value'].where(m, 0)

Output (as new column "value2" for clarity)

   group  value  value2
0      1      1       0
1      1      2       2
2      1      3       3
3      2      1       0
4      2      2       2
5      2      3       3
6      3      4       4
7      3      3       3
8      3      2       0

You can use a combination of where() and cumcount() to get the results you are expecting

df = pd.DataFrame({'group':[1,1,1,2,2,2,3,3,3],
               'value':[1,2,3,1,2,3,4,3,2]})
df.where(df.sort_values('value', ascending = False).groupby('group')['value'].cumcount() + 1 <= 2, 0)
def function1(dd:pd.DataFrame):
    return dd.assign(value=np.where(dd.index.isin(dd.nlargest(2,'value').index),dd.value,0))

df.groupby('group').apply(function1)

out

   group  value
0      1      0
1      1      2
2      1      3
3      2      0
4      2      2
5      2      3
6      3      4
7      3      3
8      3      0

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM