
Pandas custom resampling for time series data

I have time series data at 1-minute frequency. I would like to resample it every 5 minutes, and the resampled data should include the first, middle, and last time step of each 5-minute interval.

I have tried it like this, but I am not getting what I expect:

def my_fun(array):
    return array[0], array[-1]


df = pd.DataFrame(np.arange(60), index=pd.date_range('2017-01-01 00:00', '2017-01-01 00:59', freq='1T'))

df.resample('5T').apply(my_fun)

If I understood you correctly, you want the data for minutes 0, 2, 4, 5, 7, 9, 10, ... in a new DataFrame. A faster way than using resample may be:

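import numpy as np
import pandas as pd
df = pd.DataFrame(np.arange(60), index=pd.date_range('2017-01-01 00:00', '2017-01-01 00:59', freq='min'))  # 'min' replaces the older 'T' alias
# union() merges the labels at offsets 0, 2 and 4 of each 5-row block (`|` no longer does a set union on recent pandas)
idx = df.index[0::5].union(df.index[2::5]).union(df.index[4::5])
df.loc[idx]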

Output:

                      0
2017-01-01 00:00:00   0
2017-01-01 00:02:00   2
2017-01-01 00:04:00   4
2017-01-01 00:05:00   5
2017-01-01 00:07:00   7
2017-01-01 00:09:00   9
2017-01-01 00:10:00  10
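
The same selection can also be sketched with a positional mask instead of index unions. This is only an illustration, and it assumes the index is gap-free and starts on a 5-minute boundary, so that a row's position modulo 5 identifies its place in the block. The names `pos_in_block` and `selected` are made up for this example.

```python
import numpy as np
import pandas as pd

# Same 60-row, one-minute frame as in the answer above.
index = pd.date_range('2017-01-01 00:00', periods=60, freq='min')
df = pd.DataFrame(np.arange(60), index=index)

# Position of each row inside its block of five: 0, 1, 2, 3, 4, 0, 1, ...
# (assumption: no missing minutes, so position and time stay in step)
pos_in_block = np.arange(len(df)) % 5

# Keep the first (0), middle (2) and last (4) row of every block.
selected = df[np.isin(pos_in_block, [0, 2, 4])]
print(selected.head(7))
```

If minutes can be missing, the positions no longer line up with the clock, and a time-based grouping such as resample is the safer choice.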

If you just wanted a combined list of your selected data in one row then you were almost there:

def my_fun(array):
    # use positional access; returning a tuple matches the output shown below
    return (array.iloc[0], array.iloc[2], array.iloc[4])

df = pd.DataFrame({'0': np.arange(60)}, index=pd.date_range('2017-01-01 00:00', '2017-01-01 00:59', freq='min'))
df.resample('5min').apply(my_fun)

Output:

                        0
2017-01-01 00:00:00     (0, 2, 4)
2017-01-01 00:05:00     (5, 7, 9)
2017-01-01 00:10:00     (10, 12, 14)
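
If separate columns are easier to work with than tuples, one possible sketch is to reshape the values into blocks of five and pick columns 0, 2 and 4. This is not part of the answer above. It assumes a gap-free series whose length is a multiple of 5, and the column names `first`, `middle` and `last` are chosen only for illustration.

```python
import numpy as np
import pandas as pd

index = pd.date_range('2017-01-01 00:00', periods=60, freq='min')
s = pd.Series(np.arange(60), index=index)

# One row per 5-minute block, one column per minute within the block.
# (assumption: len(s) is a multiple of 5 and no minutes are missing)
blocks = s.to_numpy().reshape(-1, 5)

# Keep the first, middle and last minute of each block,
# labelled by the timestamp that starts the block.
wide = pd.DataFrame(
    blocks[:, [0, 2, 4]],
    index=s.index[::5],
    columns=['first', 'middle', 'last'],
)
print(wide.head(3))
```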
