
Autofill datetime in Pandas by previous increment

Given previous datetime values in a Pandas DataFrame (either as an index or as values in a column), is there a way to "autofill" the remaining timestamps based on the previous fixed increment?

For example, given:

import pandas as pd
import numpy as np

df = pd.DataFrame({'B': [0, 1, 2, np.nan, 4]},
                  index=[pd.Timestamp('20130101 09:00:00'),
                         pd.Timestamp('20130101 09:00:05'),
                         pd.Timestamp('20130101 09:00:10'),
                         np.nan,
                         np.nan])

I would like to apply a function to yield:

                       B
2013-01-01 09:00:00  0.0
2013-01-01 09:00:05  1.0
2013-01-01 09:00:10  2.0
2013-01-01 09:00:15  NaN
2013-01-01 09:00:20  4.0

The timestamps are missing for my last two data points. Here, timesteps are fixed at 5-second increments.

This will be for thousands of rows. While I might reset_index and then write a function to apply to each row, this seems cumbersome. Is there a slick or built-in way to do this that I'm not finding?

This solution might work for you, but it also uses the reset_index() function.

# Build the full series of 5-second timestamps.
# 'periods=1000' can be changed to 'periods=len(df.index)'.
new_dateindex = pd.Series(
    pd.date_range(start=pd.Timestamp('20130101 09:00:00'), periods=1000, freq='5s'),
    name='Date')

# Join on the positional index so every row gets its generated timestamp in 'Date'.
df.reset_index().join(new_dateindex, how='right')
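As a rough sketch of how the joined result could be turned back into a datetime-indexed frame: the version below assumes periods=len(df), drops the old (partly missing) index with reset_index(drop=True), and promotes the generated 'Date' column to the index. The name `out` is only illustrative, and pd.NaT is used here in place of np.nan for the missing index entries.

```python
import numpy as np
import pandas as pd

# Same data as in the question; pd.NaT marks the missing timestamps.
df = pd.DataFrame({'B': [0, 1, 2, np.nan, 4]},
                  index=[pd.Timestamp('20130101 09:00:00'),
                         pd.Timestamp('20130101 09:00:05'),
                         pd.Timestamp('20130101 09:00:10'),
                         pd.NaT,
                         pd.NaT])

# One generated timestamp per row, 5 seconds apart (assumed fixed step).
new_dateindex = pd.Series(
    pd.date_range(start=df.index[0], periods=len(df), freq='5s'),
    name='Date')

# Discard the old index, join positionally, then use 'Date' as the index.
out = (df.reset_index(drop=True)
         .join(new_dateindex, how='right')
         .set_index('Date'))
print(out)
```

This keeps column B untouched, including its NaN, and only replaces the timestamps.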

Assuming the first index value is a valid datetime and all the values are spaced 5s apart, you could do the following:

df.index = pd.date_range(df.index[0], periods=len(df), freq='5s')
>>> df
                       B
2013-01-01 09:00:00  0.0
2013-01-01 09:00:05  1.0
2013-01-01 09:00:10  2.0
2013-01-01 09:00:15  NaN
2013-01-01 09:00:20  4.0
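If you would rather not hard-code the 5-second step, one possible variation is to infer it from the timestamps that are present. The sketch below assumes the first two index values are valid and that the spacing is constant for the whole frame; `step` is just an illustrative name.

```python
import numpy as np
import pandas as pd

# Same data as in the question; pd.NaT marks the missing timestamps.
df = pd.DataFrame({'B': [0, 1, 2, np.nan, 4]},
                  index=pd.DatetimeIndex([pd.Timestamp('20130101 09:00:00'),
                                          pd.Timestamp('20130101 09:00:05'),
                                          pd.Timestamp('20130101 09:00:10'),
                                          pd.NaT,
                                          pd.NaT]))

# Infer the increment from the first two timestamps
# (assumption: both are valid and the step never changes).
step = df.index[1] - df.index[0]

# Rebuild the whole index from the first timestamp and the inferred step.
df.index = pd.date_range(start=df.index[0], periods=len(df), freq=step)
print(df)
```

Because the index is rebuilt from the first value, any existing timestamps that did not follow the fixed step would be overwritten, so this only suits data that really is evenly spaced.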
