Filling missing dates by imputing on previous dates in Python
I have a time series that I want to lag and use to predict future data one year ahead. It looks like this:
Date        Energy  Pred Energy  Lag Error
.
2017-09-01  9       8.4
2017-10-01  10      9
2017-11-01  11      10
2017-12-01  12      11.5
2018-01-01  1       1.3
NaT                              (pred-true)
NaT
NaT
NaT
.
.
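What I want to do is fill dates into the NaT entries so that they continue from 2018-01-01 through 2019-01-01 (just like drag-to-fill in Excel), since there are enough NaT slots to reach that point.
I have tried model['Date'].fillna() in various ways, but it either just repeats the same previous date or drops things I don't want to drop.
Is there any way to fill these NaTs in 1-month increments, just like the preceding data?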
Create the df and set the index (there are better ways to set the index):
"""
Date,Energy,Pred Energy,Lag Error
2017-09-01,9,8.4
2017-10-01,10,9
2017-11-01,11,10
2017-12-01,12,11.5
2018-01-01,1,1.3
"""
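import pandas as pd
# Copy the CSV text above to the clipboard first, then read it in
df = pd.read_clipboard(sep=",", parse_dates=True)
# Use the Date column as a DatetimeIndex, then drop the now-redundant column
df.set_index(pd.DatetimeIndex(df['Date']), inplace=True)
df.drop("Date", axis=1, inplace=True)
df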
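If the clipboard is not available (for example in a script or on a headless machine), the same frame can probably be built from an in-memory string instead. A minimal sketch, assuming the CSV text shown above; the name csv_text is introduced here only for illustration:

```python
import io
import pandas as pd

# Same sample data as above, held in a string instead of the clipboard
csv_text = """Date,Energy,Pred Energy,Lag Error
2017-09-01,9,8.4
2017-10-01,10,9
2017-11-01,11,10
2017-12-01,12,11.5
2018-01-01,1,1.3
"""

# parse_dates + index_col build the DatetimeIndex in a single step
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"], index_col="Date")
print(df)
```

This also avoids the separate set_index and drop calls, which may be one of the "better ways" the answer alludes to.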
Reindex the frame on a new date_range that covers the full period:
idx = pd.date_range(start='2017-09-01', end='2019-01-01', freq='MS')
df = df.reindex(idx)
Output:
            Energy  Pred Energy  Lag Error
2017-09-01     9.0          8.4        NaN
2017-10-01    10.0          9.0        NaN
2017-11-01    11.0         10.0        NaN
2017-12-01    12.0         11.5        NaN
2018-01-01     1.0          1.3        NaN
2018-02-01     NaN          NaN        NaN
2018-03-01     NaN          NaN        NaN
2018-04-01     NaN          NaN        NaN
2018-05-01     NaN          NaN        NaN
2018-06-01     NaN          NaN        NaN
2018-07-01     NaN          NaN        NaN
2018-08-01     NaN          NaN        NaN
2018-09-01     NaN          NaN        NaN
2018-10-01     NaN          NaN        NaN
2018-11-01     NaN          NaN        NaN
2018-12-01     NaN          NaN        NaN
2019-01-01     NaN          NaN        NaN
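If the dates need to stay in a regular Date column, as in the question, rather than move into the index, one possible approach is to regenerate the whole column from the first known date. A rough sketch, assuming the known dates are contiguous month starts and all NaT rows sit at the end; the frame model below is made-up sample data shaped like the question's, not the asker's actual frame:

```python
import pandas as pd

# Hypothetical frame: five known month-start dates followed by twelve NaT rows
model = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2017-09-01", "2017-10-01", "2017-11-01", "2017-12-01", "2018-01-01"]
        + [None] * 12
    ),
    "Energy": [9.0, 10.0, 11.0, 12.0, 1.0] + [float("nan")] * 12,
})

# One month-start per row, counted from the first date; this overwrites the
# known dates with identical values and fills the NaT rows that follow
model["Date"] = pd.date_range(
    start=model["Date"].iloc[0], periods=len(model), freq="MS"
)
print(model)
```

Unlike fillna(), which can only repeat or propagate existing values, date_range computes each new date, which is why it behaves like Excel's drag-to-fill here.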