I have merged test data as below:
Device time Key score
Computers 2018-01-01 14.0 4.0
Computers 2018-01-01 11.0 4.0
Computers 2018-01-01 16.0 0.0
I need to group the data by the columns [Device, time], take the max value of the score column in each group, and get the minimum Key value among the rows with that score.
My first attempt:
df_out = df_out.groupby(['Device', 'time'])['score'].max().reset_index()
Output 1:
Device time score
Computers 2018-01-01 4.0
My second attempt:
df_out = df_out.groupby(['Device', 'time'])[['score', 'Key']].max().reset_index()
Output 2:
Device time score Key
Computers 2018-01-01 4.0 14.0
How do I get the proper minimum Key assigned?
Desired output:
Device time score Key
Computers 2018-01-01 4.0 11.0
Thanks for your hard work.
You can use transform:
df[df.score.eq(df.groupby(['Device', 'time'])['score'].transform('max'))]
Device time Key score
0 Computers 2018-01-01 14.0 4.0
As per EDIT:
df.groupby(['Device', 'time'],as_index=False).agg({'score':'max','Key':'min'})
Device time score Key
0 Computers 2018-01-01 4.0 11.0
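One thing to keep in mind: the agg call above takes the minimum Key over the whole group, so it matches the desired output here only because the smallest Key happens to sit on a max-score row. If that may not hold in your data, a minimal sketch that combines the two snippets above (filter with transform first, then aggregate) could look like this. The fourth row (Key 5.0, score 1.0) is a hypothetical addition to the sample data, included only to show the difference.

```python
import pandas as pd

# Sample data from the question, plus one hypothetical extra row
# (Key 5.0, score 1.0) that has the smallest Key but not the max score
df = pd.DataFrame({
    'Device': ['Computers'] * 4,
    'time': ['2018-01-01'] * 4,
    'Key': [14.0, 11.0, 16.0, 5.0],
    'score': [4.0, 4.0, 0.0, 1.0],
})

group_cols = ['Device', 'time']

# Plain aggregation: max score and min Key are computed independently
naive = df.groupby(group_cols, as_index=False).agg({'score': 'max', 'Key': 'min'})

# Keep only the rows whose score equals the max score of their group
is_max = df['score'].eq(df.groupby(group_cols)['score'].transform('max'))
top = df[is_max]

# Among those rows, take the smallest Key per group
result = top.groupby(group_cols, as_index=False).agg({'score': 'max', 'Key': 'min'})

print(naive)
print(result)
```

With the extra row, naive pairs score 4.0 with Key 5.0, while result keeps Key 11.0, the smallest Key among the rows that actually have the max score.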
Using apply and a custom function to get the desired row with loc:
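def selecting(x):
    # keep only the rows that have the max score in this group
    subx = x.loc[x['score'] == x['score'].max()]
    # among those, return the row with the smallest Key
    return subx.loc[subx['Key'].idxmin()]

ddf = df.groupby(['Device', 'time']).apply(selecting)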
Using your sample input, this will give:
                         Device        time   Key  score
Device    time
Computers 2018-01-01  Computers  2018-01-01  11.0    4.0
You can drop the multi-index using .reset_index(drop=True) on the result.
I edited the answer to use a custom function so that the selection is performed correctly. I realized that the previous version of my answer could raise a KeyError on more complex dataframes.
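If you would rather avoid apply altogether, one possible alternative (a sketch, not something tested beyond the sample input) is to sort so that the wanted row comes first in each group and then keep only that first row. This also leaves a flat index, so no multi-index needs to be dropped afterwards.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'Device': ['Computers'] * 3,
    'time': ['2018-01-01'] * 3,
    'Key': [14.0, 11.0, 16.0],
    'score': [4.0, 4.0, 0.0],
})

# Within each (Device, time) group, put the highest score first
# and break ties on score by the lowest Key
ordered = df.sort_values(
    ['Device', 'time', 'score', 'Key'],
    ascending=[True, True, False, True],
)

# The first row of each group is now the one with max score and min Key
out = ordered.drop_duplicates(subset=['Device', 'time']).reset_index(drop=True)
print(out)
```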