Assume some measurement data named logData (in reality recorded roughly once per minute):
import pandas as pd, numpy as np
# five sample readings, 15 minutes apart
idxData = pd.to_datetime(['08:00', '08:15', '08:30', '08:45', '09:00'])
logData = pd.DataFrame(np.array([1.0, 2.0, 3.0, 4.0, 5.0]), columns=['val'], index=idxData)
# 30-minute bins, right-closed: (08:00, 08:30], (08:30, 09:00]
idxRng = pd.interval_range(idxData[0], idxData[-1], freq='30min')
# average the readings that fall into each bin
avgData = logData.groupby(pd.cut(logData.index, idxRng)).mean()
The data is grouped into avgData, which looks like this:
val
(08:00:00, 08:30:00] 2.5
(08:30:00, 09:00:00] 4.5
This downsampled avgData should now (after some other calculations) be upsampled again, e.g. to a frequency of freq='10min', for further calculations. Since avgData.resample('10min') throws the following error, the question is: how can data with a categorical index be resampled?
TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'CategoricalIndex'
Many thanks in advance!
For resampling to work, your index needs to have the datetime64[ns] dtype. Check the dtype of the index by running the following code:
avgData.index.dtype
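To make the check concrete, here is a minimal sketch. It rebuilds avgData by hand from the values shown in the question rather than through groupby, and it attaches an arbitrary date (2024-01-01) to the timestamps; both are assumptions made only so the snippet is self-contained. The name resampleError is likewise just illustrative.

```python
import pandas as pd

# Hand-built stand-in for avgData (assumption: arbitrary date, values copied from the question)
idxRng = pd.interval_range(pd.Timestamp('2024-01-01 08:00'),
                           pd.Timestamp('2024-01-01 09:00'), freq='30min')
avgData = pd.DataFrame(
    {'val': [2.5, 4.5]},
    index=pd.CategoricalIndex(idxRng, categories=idxRng, ordered=True),
)

# The index is categorical, not datetime64[ns]
print(avgData.index.dtype)
# The categories themselves are intervals
print(type(avgData.index.categories))

# resample() rejects this index type
resampleError = None
try:
    avgData.resample('10min')
except TypeError as err:
    resampleError = err
print(resampleError)

# The interval midpoints, however, are ordinary timestamps
mids = avgData.index.categories.mid
print(mids)
```

The categories also expose .left and .right, so the interval edges could serve as timestamps instead of the midpoints, depending on which label is wanted.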
It took me a little while to figure out how to meaningfully convert a categorical index, but index.categories.mid seems to work. It returns the midpoint of each interval as a timestamp, which allows the data to be resampled via
avgData.set_index( pd.DatetimeIndex( avgData.index.categories.mid ), inplace=True)
avgData = avgData.resample('5min').interpolate(method='nearest')
which yields the expected result:
val
08:15:00 2.5
08:20:00 2.5
08:25:00 2.5
08:30:00 2.5
08:35:00 4.5
08:40:00 4.5
08:45:00 4.5
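Note that interpolate(method='nearest') relies on SciPy being installed. As a possible alternative, here is a sketch that labels each average with the right edge of its interval and back-fills onto a 10-minute grid, so every timestamp inside an interval receives that interval's mean. This is not the approach from the answer above; the hand-built avgData, the arbitrary date, and the names stepData, target and upData are assumptions for illustration.

```python
import pandas as pd

# Hand-built stand-in for avgData (assumption: arbitrary date, values copied from the question)
idxRng = pd.interval_range(pd.Timestamp('2024-01-01 08:00'),
                           pd.Timestamp('2024-01-01 09:00'), freq='30min')
avgData = pd.DataFrame(
    {'val': [2.5, 4.5]},
    index=pd.CategoricalIndex(idxRng, categories=idxRng, ordered=True),
)

edges = avgData.index.categories

# Label each mean with the right edge of its (left, right] interval
stepData = avgData.set_index(pd.DatetimeIndex(edges.right))

# 10-minute grid covering the intervals; the first left edge is skipped
# because the intervals are open on the left
target = pd.date_range(edges.left[0] + pd.Timedelta('10min'),
                       edges.right[-1], freq='10min')

# Back-fill: each grid point takes the value of the next interval end
upData = stepData.reindex(target, method='bfill')
print(upData)
```

With this labeling, 08:10 through 08:30 carry 2.5 and 08:40 through 09:00 carry 4.5, which keeps the upsampled values aligned with the original right-closed bins instead of their midpoints.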