
How to generate a DatetimeIndex for 200 observations per second?

I have data from many sensors, sampled 200 times per second. I want to resample at a lower rate to make the dataset computationally manageable. However, the time column holds elapsed time rather than absolute date-times (see the first column below). I want to create an index of absolute date-times so that I can use the resample() method to resample and aggregate over different durations.

Example:

0.000000    1.397081    -0.672387   0.552749
0.005000    2.374832    -0.221770   1.348744
0.010000    3.191852    0.776504    0.044648
0.015000    2.304027    0.188047    0.433253
0.020000    2.331740    -0.000074   0.424112
0.025000    2.869129    0.282714    1.081615
0.030000    3.312915    0.997374    0.456503
0.035000    2.044041    -0.114705   0.993204

I want a way to generate timestamps 200 times per second, starting at the timestamp when this run of the experiment began, for example 2020/03/14 23:49:19. That would let me build a DatetimeIndex and then resample and aggregate the data to 10 observations per second.

I could find no example at this frequency and granularity after reading the time series pages of the pandas documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#timestamps-vs-time-spans

The real data files are of course extremely large and confidential, so I cannot post them.

Assuming we have, for example,

df
Out[52]: 
       t        v1        v2        v3
0  0.000  1.397081 -0.672387  0.552749
1  0.005  2.374832 -0.221770  1.348744
2  0.010  3.191852  0.776504  0.044648
3  0.015  2.304027  0.188047  0.433253
4  0.020  2.331740 -0.000074  0.424112
5  0.025  2.869129  0.282714  1.081615
6  0.030  3.312915  0.997374  0.456503
7  0.035  2.044041 -0.114705  0.993204

We can define a start date/time, add the existing time axis to it as a timedelta (assuming the values are seconds), and set the result as the index:

import pandas as pd

start = pd.Timestamp("2020/03/14 23:49:19")

df.index = pd.DatetimeIndex(start + pd.to_timedelta(df['t'], unit='s'))

df
Out[55]: 
                             t        v1        v2        v3
t                                                           
2020-03-14 23:49:19.000  0.000  1.397081 -0.672387  0.552749
2020-03-14 23:49:19.005  0.005  2.374832 -0.221770  1.348744
2020-03-14 23:49:19.010  0.010  3.191852  0.776504  0.044648
2020-03-14 23:49:19.015  0.015  2.304027  0.188047  0.433253
2020-03-14 23:49:19.020  0.020  2.331740 -0.000074  0.424112
2020-03-14 23:49:19.025  0.025  2.869129  0.282714  1.081615
2020-03-14 23:49:19.030  0.030  3.312915  0.997374  0.456503
2020-03-14 23:49:19.035  0.035  2.044041 -0.114705  0.993204
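
With the DatetimeIndex in place, the downsampling to 10 observations per second that the question asks about could look roughly like the sketch below. It rebuilds the sample frame from the rows shown above; the choice of mean() as the aggregation and the name df_10hz are assumptions for illustration, not something the question specifies.

```python
import pandas as pd

# Sample rows copied from the question (200 Hz, so 5 ms apart).
df = pd.DataFrame({
    "t":  [0.000, 0.005, 0.010, 0.015, 0.020, 0.025, 0.030, 0.035],
    "v1": [1.397081, 2.374832, 3.191852, 2.304027,
           2.331740, 2.869129, 3.312915, 2.044041],
    "v2": [-0.672387, -0.221770, 0.776504, 0.188047,
           -0.000074, 0.282714, 0.997374, -0.114705],
    "v3": [0.552749, 1.348744, 0.044648, 0.433253,
           0.424112, 1.081615, 0.456503, 0.993204],
})

# Same index construction as above: start time plus elapsed seconds.
start = pd.Timestamp("2020-03-14 23:49:19")
df.index = pd.DatetimeIndex(start + pd.to_timedelta(df["t"], unit="s"))

# 10 observations per second means one bin every 100 ms.
# mean() is an assumed aggregation; max(), first(), agg({...}) etc.
# can be swapped in depending on what each sensor column needs.
df_10hz = df.drop(columns="t").resample("100ms").mean()
print(df_10hz)
```

The eight sample rows span only 35 ms, so they all fall into a single 100 ms bin here; on the full data each bin would aggregate 20 consecutive observations.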

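If the samples are known to be evenly spaced with no gaps, the timestamps could also be generated directly rather than derived from the elapsed-time column. A minimal sketch using pd.date_range, where n_obs is a hypothetical stand-in for the number of rows in the real data:

```python
import pandas as pd

start = pd.Timestamp("2020-03-14 23:49:19")
n_obs = 8  # hypothetical; use len(df) for the real data

# 200 observations per second means one timestamp every 5 ms.
idx = pd.date_range(start=start, periods=n_obs, freq="5ms")
print(idx)
```

Assigning idx to df.index would then give the same kind of index as above. This only holds if no samples were dropped during recording; when an elapsed-time column exists, deriving the index from it is the safer choice.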