How to generate a DatetimeIndex for 200 observations per second?

I have data from many sensors, with observations arriving 200 times per second. I want to resample at a lower rate to make the dataset computationally manageable. However, the time column is not an absolute date and time; please see the first column below. I want to create an index in absolute datetime so that I can easily use resample() to resample and aggregate over different durations.

Example:

0.000000    1.397081    -0.672387   0.552749
0.005000    2.374832    -0.221770   1.348744
0.010000    3.191852    0.776504    0.044648
0.015000    2.304027    0.188047    0.433253
0.020000    2.331740    -0.000074   0.424112
0.025000    2.869129    0.282714    1.081615
0.030000    3.312915    0.997374    0.456503
0.035000    2.044041    -0.114705   0.993204

I want a way to generate timestamps 200 times per second, starting from the timestamp at which the experiment run began, for example 2020/03/14 23:49:19. This will let me build a DatetimeIndex and then resample and aggregate the data to 10 times per second.

I could not find an example at this frequency and granularity after reading the pandas date functionality pages: https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#timestamps-vs-time-spans

The real data files are of course extremely large, and confidential, so I cannot post them.

Assuming we have, for example:

df
Out[52]: 
       t        v1        v2        v3
0  0.000  1.397081 -0.672387  0.552749
1  0.005  2.374832 -0.221770  1.348744
2  0.010  3.191852  0.776504  0.044648
3  0.015  2.304027  0.188047  0.433253
4  0.020  2.331740 -0.000074  0.424112
5  0.025  2.869129  0.282714  1.081615
6  0.030  3.312915  0.997374  0.456503
7  0.035  2.044041 -0.114705  0.993204

We can define a start date/time, add the existing time axis to it as a timedelta (assuming seconds here), and set the result as the index:

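import pandas as pd

start = pd.Timestamp("2020/03/14 23:49:19")

# offset each relative time in column 't' (assumed to be seconds) from the start timestamp
df.index = pd.DatetimeIndex(start + pd.to_timedelta(df['t'], unit='s'))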

df
Out[55]: 
                             t        v1        v2        v3
t                                                           
2020-03-14 23:49:19.000  0.000  1.397081 -0.672387  0.552749
2020-03-14 23:49:19.005  0.005  2.374832 -0.221770  1.348744
2020-03-14 23:49:19.010  0.010  3.191852  0.776504  0.044648
2020-03-14 23:49:19.015  0.015  2.304027  0.188047  0.433253
2020-03-14 23:49:19.020  0.020  2.331740 -0.000074  0.424112
2020-03-14 23:49:19.025  0.025  2.869129  0.282714  1.081615
2020-03-14 23:49:19.030  0.030  3.312915  0.997374  0.456503
2020-03-14 23:49:19.035  0.035  2.044041 -0.114705  0.993204
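
If the samples are known to be evenly spaced with none missing, the index could also be generated directly with pd.date_range at a 5 ms frequency (200 per second), and the frame could then be downsampled to 10 per second with resample("100ms"). The following is only a minimal sketch under that assumption: the data is a hypothetical stand-in (a single column v1 holding a simple ramp), and mean() is just one possible aggregation. If samples may have been dropped, building the index from the recorded t column as shown above is the safer choice.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data: 2 seconds of samples at 200 Hz (one every 5 ms).
n = 400
start = pd.Timestamp("2020-03-14 23:49:19")

# One timestamp every 5 ms, beginning at the start of the experiment run.
index = pd.date_range(start=start, periods=n, freq="5ms")

# A simple ramp stands in for the real sensor values.
df = pd.DataFrame({"v1": np.arange(n, dtype=float)}, index=index)

# Downsample to 10 Hz: each 100 ms bin averages 20 consecutive samples.
df_10hz = df.resample("100ms").mean()

print(df_10hz.head())
```

Other aggregations such as .max(), .median(), or .agg([...]) can be used in place of .mean(), and a different bin width (for example "1s") only requires changing the string passed to resample().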
