Resample OHLC data with pandas
There are a lot of similar questions, each with its own specific issues and answers, but I haven't found a fitting solution, nor an understanding of how to do it.
I have typical data:
date open high low close volume spot
1507842000 5313.3 5345.6 5272 5295.1 22612561 5301.462201
1507845600 5295.1 5326.7 5286.1 5301.1 12127159 5308.487754
1507849200 5301.1 5467.5 5301.1 5464.5 54568881 5401.331605
1507852800 5464.7 5497 5394.9 5402.5 58411322 5446.552171
1507856400 5402.1 5542 5402.1 5541.2 50272286 5466.652636
1507860000 5540.4 5980 5440.1 5694.5 182746217 5717.856124
1507863600 5689.8 5800 5604.5 5739.6 78341266 5709.488508
1507867200 5742 5897 5713.1 5753.2 79738461 5794.402674
1507870800 5753.1 5798.9 5520.3 5574.5 87621428 5640.727381
1507874400 5574.6 5672.6 5503.2 5608.4 56964404 5591.237093
1507878000 5607.5 5689.1 5570 5660 46132190 5640.761482
1507881600 5660 5743 5634.8 5652 50173714 5690.219952
It contains not just OHLC, but also volume and spot price.
I am trying to resample from hours to days.
So, I load the CSV:
data_hourly = pd.read_csv('../data/hourly.csv', parse_dates=True, date_parser=date_parse, index_col=0, header=0)
(the date_parse function removes the minutes / seconds)
I tried:
data_daily = data_hourly.resample('1D').ohlc()
and this clearly doesn't work at all; it gives me rows with a large number of columns.
And I tried:
columns_dict = {'open': 'first', 'high': 'max', 'low': 'min', 'close': 'last', 'volume': 'sum', 'spot': 'average'}
data_daily = data_hourly.resample('1D', how=columns_dict)
but this crashes with an error:
"%r object has no attribute %r" % (type(self).__name__, attr)
AttributeError: 'SeriesGroupBy' object has no attribute 'average'
Besides, it tells me the 'how' argument is deprecated anyway, but I haven't seen an example of doing it the 'new' way.
You are close: you need mean instead of average, and you need to pass the dictionary to Resampler.agg:
columns_dict = {'open': 'first', 'high': 'max', 'low': 'min',
'close': 'last', 'volume': 'sum', 'spot': 'mean'}
data_daily = data_hourly.resample('1D').agg(columns_dict)
print(data_daily)
              open    high     low   close     volume         spot
date
2017-10-12  5313.3  5467.5  5272.0  5464.5   89308601  5337.093853
2017-10-13  5464.7  5980.0  5394.9  5652.0  690401288  5633.099780
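For anyone who wants to reproduce this without the CSV file, here is a minimal self-contained sketch using the sample rows from the question. It makes a few assumptions that are not in the original: the data is read from an in-memory string instead of '../data/hourly.csv', and the epoch seconds in the date column are converted with pd.to_datetime(..., unit='s') in place of the question's date_parse helper, which was not shown. The last lines also illustrate why .ohlc() on the whole frame produced so many columns.

```python
import io

import pandas as pd

# Sample rows copied from the question; 'date' holds Unix epoch seconds.
raw = """date open high low close volume spot
1507842000 5313.3 5345.6 5272 5295.1 22612561 5301.462201
1507845600 5295.1 5326.7 5286.1 5301.1 12127159 5308.487754
1507849200 5301.1 5467.5 5301.1 5464.5 54568881 5401.331605
1507852800 5464.7 5497 5394.9 5402.5 58411322 5446.552171
1507856400 5402.1 5542 5402.1 5541.2 50272286 5466.652636
1507860000 5540.4 5980 5440.1 5694.5 182746217 5717.856124
1507863600 5689.8 5800 5604.5 5739.6 78341266 5709.488508
1507867200 5742 5897 5713.1 5753.2 79738461 5794.402674
1507870800 5753.1 5798.9 5520.3 5574.5 87621428 5640.727381
1507874400 5574.6 5672.6 5503.2 5608.4 56964404 5591.237093
1507878000 5607.5 5689.1 5570 5660 46132190 5640.761482
1507881600 5660 5743 5634.8 5652 50173714 5690.219952
"""

# Assumption: read from a string rather than the CSV file in the question.
data_hourly = pd.read_csv(io.StringIO(raw), sep=r"\s+")

# Assumption: convert epoch seconds directly instead of using date_parse.
data_hourly["date"] = pd.to_datetime(data_hourly["date"], unit="s")
data_hourly = data_hourly.set_index("date")

# One aggregation per column, applied to each daily bin.
columns_dict = {"open": "first", "high": "max", "low": "min",
                "close": "last", "volume": "sum", "spot": "mean"}
data_daily = data_hourly.resample("1D").agg(columns_dict)
print(data_daily)

# For comparison: .ohlc() computes open/high/low/close for every column,
# so six input columns become 6 * 4 = 24 output columns.
wide = data_hourly.resample("1D").ohlc()
print(wide.shape)
```

The dictionary form keeps one output column per input column, which is usually what you want when the data is already in OHLC shape; .ohlc() is more suited to a single price series.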