Calculate 5-minute returns from 1-minute data in a pandas DataFrame

I have 1-minute price data in a pandas DataFrame like this:

          Date                Time      Open      High       Low     Close  
390 2004-04-13 1900-01-01 09:31:00  1146.210  1147.020  1146.210  1147.020   
391 2004-04-13 1900-01-01 09:32:00  1147.120  1147.339  1147.120  1147.219   
392 2004-04-13 1900-01-01 09:33:00  1147.100  1147.630  1147.100  1147.630   
393 2004-04-13 1900-01-01 09:34:00  1147.700  1147.700  1147.439  1147.469   
394 2004-04-13 1900-01-01 09:35:00  1147.560  1147.730  1147.560  1147.680   
395 2004-04-13 1900-01-01 09:36:00  1147.700  1147.700  1147.640  1147.640   
396 2004-04-13 1900-01-01 09:37:00  1147.810  1147.810  1147.430  1147.430   
397 2004-04-13 1900-01-01 09:38:00  1147.310  1147.310  1147.110  1147.110   
398 2004-04-13 1900-01-01 09:39:00  1147.050  1147.050  1146.870  1146.870   
399 2004-04-13 1900-01-01 09:40:00  1146.860  1147.120  1146.860  1147.110   
400 2004-04-13 1900-01-01 09:41:00  1147.020  1147.170  1147.000  1147.170   
401 2004-04-13 1900-01-01 09:42:00  1147.219  1147.250  1147.150  1147.210   
402 2004-04-13 1900-01-01 09:43:00  1147.210  1147.210  1146.969  1146.969   
403 2004-04-13 1900-01-01 09:44:00  1146.850  1146.850  1146.510  1146.510   
404 2004-04-13 1900-01-01 09:45:00  1146.390  1146.510  1146.280  1146.510   
405 2004-04-13 1900-01-01 09:46:00  1146.110  1146.110  1144.819  1144.819   
406 2004-04-13 1900-01-01 09:47:00  1144.439  1144.439  1144.060  1144.060   
407 2004-04-13 1900-01-01 09:48:00  1144.200  1144.350  1144.120  1144.120   
408 2004-04-13 1900-01-01 09:49:00  1143.890  1143.930  1143.890  1143.930   
409 2004-04-13 1900-01-01 09:50:00  1143.910  1144.010  1143.770  1144.010   
410 2004-04-13 1900-01-01 09:51:00  1144.210  1144.360  1144.210  1144.360   
411 2004-04-13 1900-01-01 09:52:00  1144.490  1144.850  1144.490  1144.850   
412 2004-04-13 1900-01-01 09:53:00  1145.110  1145.219  1144.910  1144.910   
413 2004-04-13 1900-01-01 09:54:00  1144.930  1144.969  1144.930  1144.960   
414 2004-04-13 1900-01-01 09:55:00  1144.920  1144.920  1144.770  1144.770   
415 2004-04-13 1900-01-01 09:56:00  1144.830  1144.939  1144.800  1144.800 

I want to calculate the 5-minute log returns, that is, log(09:35:00 Close / 09:31:00 Open), log(09:40:00 Close / 09:35:00 Close), ..., log(15:55:00 Close / 15:50:00 Close), log(16:00:00 Close / 15:55:00 Close).

I then want to take the sum of the quartic (fourth-power) returns. How can I do this? Thanks.
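Written out in symbols, the target quantity might look like the following. The notation is introduced here only for illustration: C_t and O_t stand for the Close and Open of the 1-minute bar stamped t, t_k for the k-th 5-minute mark (09:35, 09:40, ..., 16:00), and K for the number of such marks in the session (78 if the data really runs from 09:31 to 16:00).

```latex
r_1 = \log\frac{C_{09:35}}{O_{09:31}}, \qquad
r_k = \log\frac{C_{t_k}}{C_{t_{k-1}}} \quad (k = 2, \dots, K), \qquad
Q = \sum_{k=1}^{K} r_k^{4}
```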

If I use dataframe.shift(5) and then calculate the returns, what I obtain is the rolling 5-minute returns, which is not exactly what I want.
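To make the distinction concrete, here is a minimal sketch contrasting the two. The prices are made up for illustration and the variable names are hypothetical; only the shift(5) idea comes from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical 1-minute closes, 09:31 through 09:45 (made-up values).
idx = pd.date_range("2004-04-13 09:31", periods=15, freq="min")
close = pd.Series(np.linspace(100.0, 101.4, 15), index=idx)

log_close = np.log(close)

# shift(5) compares every row with the row 5 minutes earlier,
# so consecutive results cover overlapping windows.
rolling_ret = (log_close - log_close.shift(5)).dropna()

# Keeping only the bars on the 5-minute marks and differencing them
# gives one return per non-overlapping interval instead.
on_grid = log_close[log_close.index.minute % 5 == 0]
block_ret = on_grid.diff().dropna()

print(len(rolling_ret))  # one value per minute from 09:36 on
print(len(block_ret))    # one value per completed 5-minute interval
```

The non-overlapping returns are a subset of the rolling ones (the values that fall on the 5-minute marks), which is why shift(5) alone produces too many rows.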

Use pd.TimeGrouper('5T'). (TimeGrouper was removed in pandas 1.0; on current versions use pd.Grouper(freq='5min') instead.)

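import numpy as np
import pandas as pd

# Combine the date with the time of day (Time carries a dummy 1900-01-01 date)
df = df.set_index(df.Date + (df.Time - pd.to_datetime(df.Time.dt.date)))

cols = ['Open', 'High', 'Low', 'Close']
# pd.Grouper(freq=...) replaces pd.TimeGrouper, which was removed in pandas 1.0
agg = np.log(df[cols]).groupby(pd.Grouper(freq='5min')).agg(['first', 'last'])
agg.stack(0).T.diff().dropna().squeeze().unstack()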
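Note that the grouping above uses left-closed bins by default ([09:30, 09:35), [09:35, 09:40), ...) and measures the change from the first to the last value inside each bin. If you want the exact definition from the question (close-to-close on the 09:35, 09:40, ... marks, with the first interval starting from the 09:31 Open), plus the sum of fourth powers, something like the following sketch may be closer. It assumes the frame already has a DatetimeIndex as built above; five_minute_log_returns and quartic_sum_by_day are hypothetical helper names, not pandas functions, and the sample frame reuses the first ten bars from the question.

```python
import numpy as np
import pandas as pd


def five_minute_log_returns(day_bars):
    """Non-overlapping 5-minute log returns for ONE trading day of 1-minute bars.

    Assumes a DatetimeIndex and 'Open'/'Close' columns (hypothetical helper).
    """
    # Bins are (09:30, 09:35], (09:35, 09:40], ... labelled by their right edge,
    # so the last Close in each bin is the 09:35, 09:40, ... close.
    close = (
        day_bars["Close"]
        .resample("5min", closed="right", label="right")
        .last()
        .dropna()
    )
    log_close = np.log(close)
    returns = log_close.diff()
    # The first interval has no previous close; use the day's first Open,
    # as the question does.
    returns.iloc[0] = log_close.iloc[0] - np.log(day_bars["Open"].iloc[0])
    return returns


def quartic_sum_by_day(bars):
    """Sum of fourth powers of the 5-minute log returns, one value per day."""
    sums = {
        day: (five_minute_log_returns(day_bars) ** 4).sum()
        for day, day_bars in bars.groupby(bars.index.normalize())
    }
    return pd.Series(sums)


# First ten bars from the question (Open and Close only).
idx = pd.date_range("2004-04-13 09:31", periods=10, freq="min")
bars = pd.DataFrame(
    {
        "Open": [1146.210, 1147.120, 1147.100, 1147.700, 1147.560,
                 1147.700, 1147.810, 1147.310, 1147.050, 1146.860],
        "Close": [1147.020, 1147.219, 1147.630, 1147.469, 1147.680,
                  1147.640, 1147.430, 1147.110, 1146.870, 1147.110],
    },
    index=idx,
)

returns = five_minute_log_returns(bars)
quartic = quartic_sum_by_day(bars)
print(returns)
print(quartic)
```

Splitting by day keeps the overnight move out of the first return of each session. If the data stops partway through a 5-minute interval, the last bin is partial and its return covers fewer than five minutes, so you may want to drop it.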
