How to calculate day max for time series along with relevant hour?
I have a DataFrame containing a time series. It has 3 columns: day, hour and value:
| day | hour | value |
|---|---|---|
| 12-Jan | 11-00 | 14 |
| 12-Jan | 12-00 | 100 |
| 12-Jan | 13-00 | 345 |
| 12-Jan | 14-00 | 195 |
| 13-Jan | 12-00 | 76 |
| 13-Jan | 13-00 | 221 |
| 13-Jan | 14-00 | 102 |
| 13-Jan | 15-00 | 395 |
As you can see, the max value for 12-Jan is observed at 13-00.
I want to calculate the max of "value" for each "day". I can do it via a simple
df.groupby("day")["value"].max()
It works, but after grouping the hour information is lost. The question is: how can I build a DataFrame that contains each day's max value along with the hour when that value was observed, i.e.
| day | hour when maxValue was observed | maxValue |
|---|---|---|
| 12-Jan | 13-00 | 345 |
| 13-Jan | 15-00 | 395 |
?
EDIT
I created a sample of your df:
day hour value
0 2021-01-12 11-00 14
1 2021-01-12 12-00 100
2 2021-01-12 13-00 345
3 2021-01-12 14-00 195
4 2021-01-13 12-00 76
5 2021-01-13 13-00 221
6 2021-01-13 14-00 102
7 2021-01-13 15-00 395
And ran this code on it:
# as_index is a groupby() argument, not an agg() one, so it is omitted here; 'day' stays in the index
res = pd.merge(df.groupby('day').agg({'value': 'max'}).add_prefix('max_'), df, how='left', left_on='max_value', right_on='value')
And got back:
max_value day hour value
0 345 2021-01-12 13-00 345
1 395 2021-01-13 15-00 395
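Note that the merge above matches on the value alone, so it relies on a day's max not also appearing as a value on some other day. As an alternative sketch (not part of the original answer), `idxmax` can pick the row label of each day's max directly, which keeps the hour without any merge. The sample frame below is rebuilt from the printout above, and storing `day` as plain strings is an assumption:

```python
import pandas as pd

# Sample data rebuilt from the printout above; 'day' kept as strings (assumption)
df = pd.DataFrame({
    "day": ["2021-01-12"] * 4 + ["2021-01-13"] * 4,
    "hour": ["11-00", "12-00", "13-00", "14-00",
             "12-00", "13-00", "14-00", "15-00"],
    "value": [14, 100, 345, 195, 76, 221, 102, 395],
})

# For each day, idxmax returns the index label of the row holding the max value
idx = df.groupby("day")["value"].idxmax()

# Select those rows; a tie within a day resolves to the first occurrence
res = df.loc[idx].reset_index(drop=True)
print(res)
```

Because the rows are selected by index label, a max value that repeats on another day cannot pull in extra rows.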