
How to resample irregular time series to daily frequency and have it span to today?

I have a pandas data frame whose observations are irregularly spaced in time. I can successfully up-sample it to a daily frequency using the resample command, but the resampled frame ends at the last original observation. I would like the resampling to span all the way to today's date.

For example, here is the irregular dataframe:

data
Out[1]: 
            Var 1     Var 2   Var 3     Var 4
Dates                                        
2017-09-20   16.0  1.328125   1.375  0.135976
2017-12-13   16.0  1.343750   1.375  0.085391
2018-03-21   15.0  2.191667   2.125  0.274946
2018-06-13   15.0  2.241667   2.375  0.208452
2018-09-26   16.0  4.312500   2.375  0.111803
2018-12-19   17.0  4.279412   2.375  0.083026
2019-03-20   17.0  3.507353   2.375  0.179358

I used

dset = data.resample('D', convention = 'end').ffill()

which results (the tail end) in

dset.tail()
Out[2]: 
            Var 1     Var 2   Var 3     Var 4
Dates                                        
2019-03-16   17.0  4.279412   2.375  0.083026
2019-03-17   17.0  4.279412   2.375  0.083026
2019-03-18   17.0  4.279412   2.375  0.083026
2019-03-19   17.0  4.279412   2.375  0.083026
2019-03-20   17.0  3.507353   2.375  0.179358

which is great, except that the upsampling ends on 3/20/2019, and I would like it to end on 4/13/2019 (today's date). As you can see, the resampling I am after simply repeats each irregular observation daily until the next irregular data point, which is then repeated daily until the one after that, and so on.

I am sure I am missing something simple, such as an extra argument to the command. I would prefer to stay within pandas, if possible.

I would like the finished data to look like:

dset.tail()
Out[2]: 
            Var 1     Var 2   Var 3     Var 4
Dates                                        
2019-03-20   17.0  3.507353   2.375  0.179358
2019-03-21   17.0  3.507353   2.375  0.179358
2019-03-22   17.0  3.507353   2.375  0.179358

more days, repeated

2019-04-11   17.0  3.507353   2.375  0.179358
2019-04-12   17.0  3.507353   2.375  0.179358
2019-04-13   17.0  3.507353   2.375  0.179358

Thank you all either way for any help/hints provided.

Use DataFrame.reindex together with pandas.date_range, forward-filling onto a daily index that runs through today:

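import pandas as pd

# Build a daily index from the first observation through today (at midnight),
# then forward-fill each observation onto it.
dset = data.reindex(
           pd.date_range(start=data.index.min(),
                         end=pd.Timestamp.today().normalize(),
                         freq='D'),
           method='ffill')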

[output]

            Var 1     Var 2  Var 3     Var 4
2017-09-20   16.0  1.328125  1.375  0.135976
2017-09-21   16.0  1.328125  1.375  0.135976
2017-09-22   16.0  1.328125  1.375  0.135976
2017-09-23   16.0  1.328125  1.375  0.135976
2017-09-24   16.0  1.328125  1.375  0.135976
2017-09-25   16.0  1.328125  1.375  0.135976
2017-09-26   16.0  1.328125  1.375  0.135976
2017-09-27   16.0  1.328125  1.375  0.135976
2017-09-28   16.0  1.328125  1.375  0.135976
2017-09-29   16.0  1.328125  1.375  0.135976
...
2019-04-04   17.0  3.507353  2.375  0.179358
2019-04-05   17.0  3.507353  2.375  0.179358
2019-04-06   17.0  3.507353  2.375  0.179358
2019-04-07   17.0  3.507353  2.375  0.179358
2019-04-08   17.0  3.507353  2.375  0.179358
2019-04-09   17.0  3.507353  2.375  0.179358
2019-04-10   17.0  3.507353  2.375  0.179358
2019-04-11   17.0  3.507353  2.375  0.179358
2019-04-12   17.0  3.507353  2.375  0.179358
2019-04-13   17.0  3.507353  2.375  0.179358
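
For anyone who wants to try this without the original data, here is a minimal, self-contained sketch of the same reindex approach. The helper name ffill_daily_through and the two-row sample frame are made up for illustration (only two of the question's columns and its last two rows are used), and the end date is fixed at 2019-04-13 so the result is reproducible. It also assumes the index is a sorted DatetimeIndex, which reindex with method='ffill' requires.

```python
import pandas as pd


def ffill_daily_through(df, end=None):
    """Forward-fill df onto a daily index from its first date through `end`.

    Hypothetical helper for illustration; `end` defaults to today's date.
    Assumes df has a sorted (monotonic increasing) DatetimeIndex.
    """
    if end is None:
        end = pd.Timestamp.today()
    # Drop any time-of-day component so the last row lands on midnight.
    end = pd.Timestamp(end).normalize()
    daily_index = pd.date_range(
        start=df.index.min(), end=end, freq="D", name=df.index.name
    )
    # Each new date takes the most recent earlier observation.
    return df.reindex(daily_index, method="ffill")


# Two rows taken from the tail of the question's frame.
data = pd.DataFrame(
    {"Var 1": [17.0, 17.0], "Var 2": [4.279412, 3.507353]},
    index=pd.DatetimeIndex(["2018-12-19", "2019-03-20"], name="Dates"),
)

dset = ffill_daily_through(data, end="2019-04-13")
print(dset.tail(3))
```

Calling ffill_daily_through(data) with no end argument extends the frame to the current date instead. Because the index is built explicitly, the end point no longer depends on where the last observation falls, which is the limitation of calling resample on its own.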
