
How to resample intra-day intervals and use .idxmax()?

I am using data from yfinance, which returns a pandas DataFrame.

                            Volume
Datetime                          
2021-09-13 09:30:00-04:00   951104
2021-09-13 09:35:00-04:00   408357
2021-09-13 09:40:00-04:00   498055
2021-09-13 09:45:00-04:00   466363
2021-09-13 09:50:00-04:00   315385
2021-12-06 15:35:00-05:00   200748
2021-12-06 15:40:00-05:00   336136
2021-12-06 15:45:00-05:00   473106
2021-12-06 15:50:00-05:00   705082
2021-12-06 15:55:00-05:00  1249763

The DataFrame contains 5-minute intra-day intervals. I want to resample to daily data and get the idxmax of the maximum volume for each day.

df.resample("B")["Volume"].idxmax()

This returns an error:

ValueError: attempt to get argmax of an empty sequence

I used B (business days) as the resampling period, so there shouldn't be any empty sequences.

I should add that .max() works fine.

Also, using .agg as suggested in another question returns an error:

df["Volume"].resample("B").agg(lambda x : np.nan if x.count() == 0 else x.idxmax()) 

Error:

IndexError: index 77 is out of bounds for axis 0 with size 0

For me, it works to test in an if-else whether all values in the group are NaN:

df = df.resample("B")["Volume"].agg(lambda x: np.nan if x.isna().all() else x.idxmax())
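
A minimal sketch of why empty bins can appear and how the guard above handles them. The data is hypothetical (two trading days with no rows for the business day in between), and the names `counts` and `daily_idxmax` are illustrative only. `resample("B")` creates a bin for every business day in the index range, including days with no rows such as market holidays, so the aggregation also receives an empty group.

```python
import numpy as np
import pandas as pd

# Hypothetical 5-minute bars: Monday and Wednesday only, nothing on Tuesday
idx = pd.DatetimeIndex(
    [
        "2021-09-13 09:30",
        "2021-09-13 09:35",
        "2021-09-15 09:30",
        "2021-09-15 09:35",
    ],
    name="Datetime",
).tz_localize("America/New_York")
df = pd.DataFrame({"Volume": [951104, 408357, 200748, 1249763]}, index=idx)

# Count rows per business-day bin; the Tuesday bin exists but holds no rows
counts = df.resample("B")["Volume"].count()
print(counts)

# Guarded aggregation: return NaN for a bin with no usable values,
# otherwise the timestamp of that day's maximum volume
daily_idxmax = df.resample("B")["Volume"].agg(
    lambda x: np.nan if x.isna().all() else x.idxmax()
)
print(daily_idxmax)
```

The empty bin comes back as a missing value, which can be removed afterwards with `.dropna()` if only days with data are of interest.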

You can use groupby as an alternative to resample:

>>> df.groupby(df.index.normalize())['Volume'].agg(Datetime='idxmax', Volume='max')

                      Datetime   Volume
Datetime                               
2021-09-13 2021-09-13 09:30:00   951104
2021-12-06 2021-12-06 15:55:00  1249763
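
A self-contained version of the groupby approach, as a sketch on hypothetical data (the timestamps, volumes and the name `by_day` are made up for illustration). Grouping by the normalized index only forms groups for dates that actually occur in the data, so no empty group ever reaches `idxmax`.

```python
import pandas as pd

# Hypothetical 5-minute bars: two trading days with a gap in between
idx = pd.DatetimeIndex(
    [
        "2021-09-13 09:30",
        "2021-09-13 09:35",
        "2021-09-15 09:30",
        "2021-09-15 09:35",
    ],
    name="Datetime",
).tz_localize("America/New_York")
df = pd.DataFrame({"Volume": [951104, 408357, 200748, 1249763]}, index=idx)

# normalize() sets every timestamp to midnight, giving one key per calendar day
by_day = df.groupby(df.index.normalize())["Volume"].agg(
    Datetime="idxmax",  # timestamp of the day's maximum volume
    Volume="max",       # the maximum volume itself
)
print(by_day)
```

Unlike `resample("B")`, the result has no row for the day without data, which may or may not be what you want.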
