Pandas resample ffill() last row

Question

I want to resample a yearly dataframe to hourly frequency so that the last year is fully included. How can I do that efficiently?

I have the following dataframe:

import pandas as pd

df2 = pd.DataFrame({'col': [2, 3]}, index=['2018', '2019'])
df2.index = pd.to_datetime(df2.index)

df2

            col
2018-01-01        2
2019-01-01        3

Now I resample it hourly and fill the value for each hour of the year with the corresponding yearly value.

df2 = df2.resample('h').ffill()
print(df2.head())
print(df2.info())

                        col
    2018-01-01 00:00:00    2
    2018-01-01 01:00:00    2
    2018-01-01 02:00:00    2
    2018-01-01 03:00:00    2
    2018-01-01 04:00:00    2
    <class 'pandas.core.frame.DataFrame'>
    DatetimeIndex: 8761 entries, 2018-01-01 00:00:00 to 2019-01-01 00:00:00
    Freq: H
    Data columns (total 1 columns):
    col    8761 non-null int64
    dtypes: int64(1)
    memory usage: 136.9 KB
    None

My problem is that the forward fill stops at the first hour of 2019. I would like a forward fill that covers the entire final year, i.e. one that fills all values up to 2019-12-31 23:00:00. How can I do that efficiently?

Many thanks!

Answer 1

The idea is to create a new last row whose index is the start of the next year, append it to the yearly DataFrame, resample, and finally remove the last row:

# df2 here is the original yearly frame, before any resampling
df3 = df2.iloc[[-1]].rename(lambda x: x + pd.offsets.YearBegin())
print(df3)
            col
2020-01-01    3

# DataFrame.append was removed in pandas 2.0; pd.concat does the same job
df2 = pd.concat([df2, df3]).resample('h').ffill().iloc[:-1]
print(df2.tail())
                     col
2019-12-31 19:00:00    3
2019-12-31 20:00:00    3
2019-12-31 21:00:00    3
2019-12-31 22:00:00    3
2019-12-31 23:00:00    3
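
As an alternative to adding a sentinel row, one could build the target hourly index explicitly and forward-fill onto it with reindex. This is a minimal sketch of a different technique from the one in the answer above, not the answerer's method. The names `df`, `end`, `full_index` and `hourly` are illustrative, and the end point assumes each row is valid for one full calendar year.

```python
import pandas as pd

# Yearly data as in the question.
df = pd.DataFrame({'col': [2, 3]},
                  index=pd.to_datetime(['2018-01-01', '2019-01-01']))

# Last hour of the final year (assumption: a row covers a whole calendar year).
end = df.index[-1] + pd.offsets.YearBegin() - pd.Timedelta(hours=1)

# Hourly index from the first timestamp to that last hour.
full_index = pd.date_range(df.index[0], end, freq='h')

# Forward-fill each yearly value across the hours of its year.
hourly = df.reindex(full_index, method='ffill')
print(hourly.tail())
```

This avoids creating and then dropping an extra row, at the cost of computing the end of the range by hand.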
