Pandas intraday 8Min resample bug?

Question

It seems that for 1Min bar data, resample() has a bug whenever the sampling frequency is a multiple of 8. The code below illustrates this by resampling at [3, 5, 6, 8, 16] Min. For frequencies 3, 5 and 6, the first entry of the resampled DataFrame index starts at the base timestamp (9:30 in this case), while for frequencies 8 and 16 the resampled index starts at 9:26 and 9:18 respectively.

import pandas as pd
import datetime as dt
import numpy as np

datetime_start = dt.datetime(2014, 9, 1, 9, 30)
datetime_end = dt.datetime(2014, 9, 1, 16, 0)

tt = pd.date_range(datetime_start, datetime_end, freq='1Min')
df = pd.DataFrame(np.arange(len(tt)), index=tt, columns=['A'])

for freq in [3, 5, 6, 8, 16]:
    print(freq)
    # how= and base= are the legacy resample() arguments this question is about
    print(df.resample(str(freq) + 'Min', how='first', base=30).head(2))

This produces the following output:

3
                     A
2014-09-01 09:30:00  0
2014-09-01 09:33:00  3
5
                     A
2014-09-01 09:30:00  0
2014-09-01 09:35:00  5
6
                     A
2014-09-01 09:30:00  0
2014-09-01 09:36:00  6
8
                     A
2014-09-01 09:26:00  0
2014-09-01 09:34:00  4
16
                     A
2014-09-01 09:18:00  0
2014-09-01 09:34:00  4

Answer 1

I think resample() anchors its bins at 00:00:00 rather than at the first timestamp, so I shift the index back so that the first bar falls on 00:00, resample, and then shift the index forward again.
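The midnight anchoring can be checked with plain arithmetic. The sketch below is not pandas code: `first_bin_start` and `base_min` are hypothetical names, and the rule that the legacy `base` is taken modulo the frequency is an assumption, although it does reproduce the 9:26 and 9:18 starts shown in the question.

```python
from datetime import datetime, timedelta

def first_bin_start(first_ts, freq_min, base_min=0):
    """Left edge of the bin holding first_ts when bins are anchored at midnight.

    Assumption: bin edges sit at midnight + (base_min % freq_min) + k * freq_min.
    """
    midnight = first_ts.replace(hour=0, minute=0, second=0, microsecond=0)
    minutes_since_midnight = int((first_ts - midnight).total_seconds() // 60)
    shift = base_min % freq_min
    bins_elapsed = (minutes_since_midnight - shift) // freq_min
    return midnight + timedelta(minutes=shift + bins_elapsed * freq_min)

start = datetime(2014, 9, 1, 9, 30)
for freq in [3, 5, 6, 8, 16]:
    # base_min=30 mirrors base=30 from the question
    print(freq, first_bin_start(start, freq, base_min=30).time())
```

Since 9:30 is 570 minutes after midnight, the first bin starts exactly at 9:30 only when 570 minus the shift is a multiple of the frequency. That holds for 3, 5 and 6 with base=30, but not for 8 or 16.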

Method 1: shift the index, resample, then shift back

import pandas as pd
import datetime as dt
import numpy as np

datetime_start = dt.datetime(2014, 9, 1, 9, 30)
datetime_end = dt.datetime(2014, 9, 1, 16, 30)

tt = pd.date_range(datetime_start, datetime_end, freq='1Min')
df = pd.DataFrame(np.arange(len(tt)), index=tt, columns=['A'])

# Distance of the first bar (09:30) from midnight
offsets = pd.offsets.Hour(9) + pd.offsets.Minute(30)
for freq in [1, 3, 5, 6, 8, 16]:
    print(freq)
    # Work on a copy so every frequency is resampled from the original 1-minute data
    shifted = df.copy()
    shifted.index = shifted.index - offsets
    out = shifted.resample(str(freq) + 'min').agg({'A': 'first'})
    out.index = out.index + offsets
    print(out.head(2))

Method 2: pass the same offset (in minutes) through base

import pandas as pd
import datetime as dt
import numpy as np

datetime_start = dt.datetime(2014, 9, 1, 9, 30)
datetime_end = dt.datetime(2014, 9, 1, 16, 30)

tt = pd.date_range(datetime_start, datetime_end, freq='1Min')
df = pd.DataFrame(np.arange(len(tt)), index=tt, columns=['A'])

for freq in [1, 3, 5, 6, 8, 16]:
    print(freq)
    # base= was deprecated in pandas 1.1 and removed in 2.0; this needs an older pandas
    out = df.resample(str(freq) + 'min', base=9*60+30).agg({'A': 'first'})
    print(out.head(2))

Both methods then output:

1
                     A
2014-09-01 09:30:00  0
2014-09-01 09:31:00  1
3
                     A
2014-09-01 09:30:00  0
2014-09-01 09:33:00  3
5
                     A
2014-09-01 09:30:00  0
2014-09-01 09:35:00  5
6
                     A
2014-09-01 09:30:00  0
2014-09-01 09:36:00  6
8
                     A
2014-09-01 09:30:00  0
2014-09-01 09:38:00  8
16
                      A
2014-09-01 09:30:00   0
2014-09-01 09:46:00  16
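On pandas 1.1 or later, the same alignment can probably be had without shifting the index by hand, because resample() accepts an `origin` argument there. The sketch below assumes such a version is installed and uses `origin="start"`, which anchors the bins at the first timestamp of the series. The `results` dict is only an illustrative container.

```python
import numpy as np
import pandas as pd

# Same 1-minute bars as in the question: 09:30 to 16:00 on 2014-09-01
tt = pd.date_range("2014-09-01 09:30", "2014-09-01 16:00", freq="1min")
df = pd.DataFrame({"A": np.arange(len(tt))}, index=tt)

results = {}
for freq in [3, 5, 6, 8, 16]:
    # origin="start" anchors bin edges at the first timestamp (pandas >= 1.1)
    results[freq] = df.resample(f"{freq}min", origin="start").first()
    print(freq)
    print(results[freq].head(2))
```

Each frequency is resampled from the original 1-minute frame rather than from the previous result, so the second row of every output holds the first value of the second bin.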

Answer 1 posted 2017-06-09 15:56:33