Pandas Resampling error: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex
When I use pandas' resample function on a DataFrame to convert tick data to OHLCV, I get a resampling error.
How should we resolve the error?
# Resample data into 30min bins
bars = data.Price.resample('30min', how='ohlc')
volumes = data.Volume.resample('30min', how='sum')
This gives the error:
TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'Int64Index'
Convert the integer timestamps in the index to a DatetimeIndex:
data.index = pd.to_datetime(data.index, unit='s')
This interprets the integers as seconds since the Epoch.
For example, given
import pandas as pd

data = pd.DataFrame(
    {'Timestamp': [1313331280, 1313334917, 1313334917, 1313340309, 1313340309],
     'Price': [10.4]*3 + [10.5]*2,
     'Volume': [0.779, 0.101, 0.316, 0.150, 1.8]})
data = data.set_index(['Timestamp'])
#             Price  Volume
# Timestamp
# 1313331280   10.4   0.779
# 1313334917   10.4   0.101
# 1313334917   10.4   0.316
# 1313340309   10.5   0.150
# 1313340309   10.5   1.800
data.index = pd.to_datetime(data.index, unit='s')
yields
                     Price  Volume
2011-08-14 14:14:40   10.4   0.779
2011-08-14 15:15:17   10.4   0.101
2011-08-14 15:15:17   10.4   0.316
2011-08-14 16:45:09   10.5   0.150
2011-08-14 16:45:09   10.5   1.800
Then
ticks = data.loc[:, ['Price', 'Volume']]
bars = ticks.Price.resample('30min').ohlc()
volumes = ticks.Volume.resample('30min').sum()
can be computed:
In [368]: bars
Out[368]:
                     open  high   low  close
2011-08-14 14:00:00  10.4  10.4  10.4   10.4
2011-08-14 14:30:00   NaN   NaN   NaN    NaN
2011-08-14 15:00:00  10.4  10.4  10.4   10.4
2011-08-14 15:30:00   NaN   NaN   NaN    NaN
2011-08-14 16:00:00   NaN   NaN   NaN    NaN
2011-08-14 16:30:00  10.5  10.5  10.5   10.5
In [369]: volumes
Out[369]:
2011-08-14 14:00:00    0.779
2011-08-14 14:30:00      NaN
2011-08-14 15:00:00    0.417
2011-08-14 15:30:00      NaN
2011-08-14 16:00:00      NaN
2011-08-14 16:30:00    1.950
Freq: 30T, Name: Volume, dtype: float64
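Since the question asks for OHLCV, the two results can also be joined into a single table. The following is only a minimal sketch under that assumption: the names `ohlcv` and the `volume` column label are illustrative and not part of the original answer, and dropping the empty bins is an optional choice.

import pandas as pd

# Same sample ticks as above, indexed by integer epoch seconds.
data = pd.DataFrame(
    {'Timestamp': [1313331280, 1313334917, 1313334917, 1313340309, 1313340309],
     'Price': [10.4]*3 + [10.5]*2,
     'Volume': [0.779, 0.101, 0.316, 0.150, 1.8]}
).set_index('Timestamp')
data.index = pd.to_datetime(data.index, unit='s')

bars = data['Price'].resample('30min').ohlc()
volumes = data['Volume'].resample('30min').sum()

# Join the OHLC columns and the summed volume on the shared 30-minute index.
# 'volume' is an illustrative column name.
ohlcv = pd.concat([bars, volumes.rename('volume')], axis=1)

# Optionally keep only the bins that actually contain ticks.
ohlcv = ohlcv.dropna(subset=['open'])
print(ohlcv)

Filtering on the open column rather than on volume avoids depending on how a given pandas version reports the sum of an empty bin.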
Because it is designed for time-series data, as the error says, resample() works only if the index is datetime, timedelta or period. The following are a few common ways this error may show up.
However, you can also use the on= parameter to use a column as the grouper, without needing a datetime index.
df['Timestamp'] = pd.to_datetime(df['Timestamp'], unit='s')
bars = df.resample('30min', on='Timestamp')['Price'].ohlc()
volumes = df.resample('30min', on='Timestamp')['Volume'].sum()
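The snippet above assumes a `df` that already exists. As a self-contained sketch, assuming the same sample ticks as earlier but with the timestamps kept as an ordinary column, it might look like this:

import pandas as pd

# Sample ticks with the epoch-second timestamps kept as an ordinary column.
df = pd.DataFrame(
    {'Timestamp': [1313331280, 1313334917, 1313334917, 1313340309, 1313340309],
     'Price': [10.4]*3 + [10.5]*2,
     'Volume': [0.779, 0.101, 0.316, 0.150, 1.8]})

# The column passed to on= must be datetime-like, so convert it first.
df['Timestamp'] = pd.to_datetime(df['Timestamp'], unit='s')

resampler = df.resample('30min', on='Timestamp')
bars = resampler['Price'].ohlc()
volumes = resampler['Volume'].sum()

# df keeps its default integer index; only the results are time-indexed.
print(type(df.index).__name__, type(bars.index).__name__)

This leaves the original frame's index alone, which can be convenient when other code still relies on the integer row labels.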
If you have a MultiIndex dataframe where one of the index levels is datetime, you can use level= to select that level as the grouper.
volumes = df.resample('30min', level='Timestamp')['Volume'].sum()
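To make the level= case concrete, here is a small sketch with a hypothetical two-level index. The 'Symbol' level and its values are invented for illustration; only the 'Timestamp' level name comes from the snippet above.

import pandas as pd

# Hypothetical two-level index: an illustrative 'Symbol' level plus 'Timestamp'.
df = pd.DataFrame(
    {'Symbol': ['AAA', 'AAA', 'BBB', 'BBB'],
     'Timestamp': pd.to_datetime(
         [1313331280, 1313334917, 1313334917, 1313340309], unit='s'),
     'Volume': [0.779, 0.101, 0.316, 0.150]}
).set_index(['Symbol', 'Timestamp'])

# Resample on the datetime level only. The 'Symbol' level plays no part
# in the binning, so ticks from both symbols land in the same 30-minute bins.
volumes = df.resample('30min', level='Timestamp')['Volume'].sum()
print(volumes)

Note that the result is indexed by the resampled timestamps alone, so this form suits cases where aggregating across the other level is what you want.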
You can also use resample.agg to pass multiple methods.
resampled = df.resample('30min', on='Timestamp').agg({'Price': 'ohlc', 'Volume': 'sum'})
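A variation on the same idea, offered only as a sketch: instead of the 'ohlc' string, it spells the four price aggregations out as first/max/min/last (a substitution made here, not taken from the answer) and then gives the resulting two-level columns flat names. The flat column names are illustrative.

import pandas as pd

df = pd.DataFrame(
    {'Timestamp': [1313331280, 1313334917, 1313334917, 1313340309, 1313340309],
     'Price': [10.4]*3 + [10.5]*2,
     'Volume': [0.779, 0.101, 0.316, 0.150, 1.8]})
df['Timestamp'] = pd.to_datetime(df['Timestamp'], unit='s')

# first/max/min/last stand in for open/high/low/close here.
resampled = df.resample('30min', on='Timestamp').agg(
    {'Price': ['first', 'max', 'min', 'last'], 'Volume': ['sum']}
)

# The columns come back as (column, function) pairs, in the order given
# above; replace them with flat, illustrative OHLCV names.
resampled.columns = ['open', 'high', 'low', 'close', 'volume']
print(resampled)

Flat column names tend to be easier to work with downstream than the two-level columns that a dict of lists produces.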