Pandas custom re-sample for time series data
I have time series data at a 1-minute frequency. I would like to resample it every 5 minutes, and the resampled data should include the values at the first, middle, and last time step of each interval.
I tried the following, but I am not getting what I expect:
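import numpy as np
import pandas as pd

def my_fun(array):
    # Return the first and last value of each 5-minute group
    return array[0], array[-1]

df = pd.DataFrame(np.arange(60), index=pd.date_range('2017-01-01 00:00', '2017-01-01 00:59', freq='1min'))
df.resample('5min').apply(my_fun)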
If I understood you correctly, you want the data for minutes 0, 2, 4, 5, 7, 9, 10, ... in a new DataFrame. A faster way than using resample may be:
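import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(60), index=pd.date_range('2017-01-01 00:00', '2017-01-01 00:59', freq='1min'))
l = len(df)
# Index.union is used here because newer pandas versions no longer treat `|` between indexes as a set union
df.loc[df.iloc[range(0, l, 5)].index.union(df.iloc[range(2, l, 5)].index).union(df.iloc[range(4, l, 5)].index)]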
Output:
0
2017-01-01 00:00:00 0
2017-01-01 00:02:00 2
2017-01-01 00:04:00 4
2017-01-01 00:05:00 5
2017-01-01 00:07:00 7
2017-01-01 00:09:00 9
2017-01-01 00:10:00 10
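As a cross-check on the selection above, here is a minimal sketch that picks the same rows by position instead of by index union. It assumes the data has no gaps, so every 5-minute interval holds exactly five rows. The column name `value` and the variables `pos` and `picked` are illustrative and do not come from the original post.

```python
import numpy as np
import pandas as pd

# Same toy data as in the post: one row per minute for one hour.
df = pd.DataFrame(
    {"value": np.arange(60)},
    index=pd.date_range("2017-01-01 00:00", periods=60, freq="min"),
)

# Position of each row inside its 5-row block: 0, 1, 2, 3, 4, 0, 1, ...
# (assumption: no missing minutes, so blocks line up with 5-minute bins)
pos = np.arange(len(df)) % 5

# Keep the first (0), middle (2), and last (4) row of every block.
picked = df[np.isin(pos, [0, 2, 4])]
print(picked.head(7))
```

If minutes can be missing, positions no longer line up with the 5-minute bins, and a time-based grouping would be the safer choice.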
If you just want a combined list of the selected values in one row per interval, you were almost there:
import numpy as np
import pandas as pd

def my_fun(array):
    # First, middle, and last value of a 5-row group
    return [array[0], array[2], array[4]]

df = pd.DataFrame({'0': np.arange(60)}, index=pd.date_range('2017-01-01 00:00', '2017-01-01 00:59', freq='1min'))
df.resample('5min').apply(my_fun)
Output:
0
2017-01-01 00:00:00 (0, 2, 4)
2017-01-01 00:05:00 (5, 7, 9)
2017-01-01 00:10:00 (10, 12, 14)
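If separate columns are more convenient than several values packed into one cell, the sketch below may help. It swaps the resample call for a plain NumPy reshape, so it is an alternative rather than the approach shown above. It assumes gap-free 1-minute data whose length is a multiple of 5. The names `value`, `blocks`, `out`, and the column labels `first`, `middle`, `last` are illustrative.

```python
import numpy as np
import pandas as pd

# Same toy data as in the post: one row per minute for one hour.
df = pd.DataFrame(
    {"value": np.arange(60)},
    index=pd.date_range("2017-01-01 00:00", periods=60, freq="min"),
)

# One row per 5-minute block, one column per minute inside the block.
# (assumption: len(df) is a multiple of 5 and no minutes are missing)
blocks = df["value"].to_numpy().reshape(-1, 5)

# Columns 0, 2, and 4 are the first, middle, and last time step of each block.
out = pd.DataFrame(
    blocks[:, [0, 2, 4]],
    index=df.index[::5],  # label each row with the start of its block
    columns=["first", "middle", "last"],
)
print(out.head(3))
```

Each row of `out` is labeled with the start of its 5-minute interval, which matches the labels that `resample` produces by default for this frequency.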