
How to calculate day max for time series along with relevant hour?

I have a dataframe holding a time series. It has 3 columns: day, hour and value:

day hour value
12-Jan 11-00 14
12-Jan 12-00 100
12-Jan 13-00 345
12-Jan 14-00 195
13-Jan 12-00 76
13-Jan 13-00 221
13-Jan 14-00 102
13-Jan 15-00 395

As you can see, the max value for 12-Jan is observed at 13-00.

I want to calculate the max of "value" for each "day". I can do it with a simple

df.groupby("day")["value"].max()

It works, but after grouping the hour information is lost. The question is: how can I build a DataFrame that contains each day's max value along with the hour when that value was observed, i.e.

day     hour when maxValue was observed  maxValue
12-Jan  13-00                            345
13-Jan  15-00                            395

?

EDIT

I created a sample of your df:

         day   hour  value
0 2021-01-12  11-00     14
1 2021-01-12  12-00    100
2 2021-01-12  13-00    345
3 2021-01-12  14-00    195
4 2021-01-13  12-00     76
5 2021-01-13  13-00    221
6 2021-01-13  14-00    102
7 2021-01-13  15-00    395
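In case it helps to reproduce this, here is a minimal sketch of how such a sample could be built. The values are copied from the printout above; parsing "day" with pd.to_datetime and using the year 2021 are assumptions based on how that printout looks, not something stated in the question.

```python
import pandas as pd

# Sample rows copied from the printout above; the year 2021 is assumed from it.
df = pd.DataFrame({
    "day": pd.to_datetime(["2021-01-12"] * 4 + ["2021-01-13"] * 4),
    "hour": ["11-00", "12-00", "13-00", "14-00",
             "12-00", "13-00", "14-00", "15-00"],
    "value": [14, 100, 345, 195, 76, 221, 102, 395],
})
print(df)
```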

And ran this code on it:

# as_index is a groupby() argument, not an agg() one, so it is dropped here
res = pd.merge(
    df.groupby('day').agg({'value': 'max'}).add_prefix('max_'),
    df, how='left', left_on='max_value', right_on='value')

And got back:

   max_value        day   hour  value
0        345 2021-01-12  13-00    345
1        395 2021-01-13  15-00    395
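One caveat: the merge above matches on the value alone, so if one day's max happens to equal some value on another day, extra rows could show up. A possible alternative, sketched here under the same sample data, is to look up the row of each day's max with idxmax and select those rows directly. The names idx and max_value are just illustrative.

```python
import pandas as pd

# Same sample data as in the answer (year 2021 assumed from the printout).
df = pd.DataFrame({
    "day": pd.to_datetime(["2021-01-12"] * 4 + ["2021-01-13"] * 4),
    "hour": ["11-00", "12-00", "13-00", "14-00",
             "12-00", "13-00", "14-00", "15-00"],
    "value": [14, 100, 345, 195, 76, 221, 102, 395],
})

# For each day, idxmax gives the index label of the row with the largest value
# (the first such row if there is a tie).
idx = df.groupby("day")["value"].idxmax()

# Select those rows, so "hour" comes along with the max value of its own day.
res = df.loc[idx].rename(columns={"value": "max_value"}).reset_index(drop=True)
print(res)
```

Because the rows are picked by index label, each max stays tied to its own day, and the hour column comes along without a merge.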
