
Filling in missing data using "ffill"

I have the following data:

4/23/2021   493107
4/26/2021   485117
4/27/2021   485117
4/28/2021   485117
4/29/2021   485117
4/30/2021   485117
5/7/2021    484691

I want it to look like the following:

4/23/2021   493107
4/24/2021   485117
4/25/2021   485117
4/26/2021   485117
4/27/2021   485117
4/28/2021   485117
4/29/2021   485117
4/30/2021   485117
5/1/2021    484691
5/2/2021    484691
5/3/2021    484691
5/4/2021    484691
5/5/2021    484691
5/6/2021    484691
5/7/2021    484691

So each missing date is filled with the value from the next available date below it. I tried the following code:

 df['Date']=pd.to_datetime(df['Date'].astype(str), format='%m/%d/%Y')   
 df.set_index(df['Date'], inplace=True)    
 df = df.resample('D').sum().fillna(0)
 df['crude'] = df['crude'].replace({ 0:np.nan})
 df['crude'].fillna(method='ffill', inplace=True)

However, this fills each gap with the value from the date above instead, and gives the following:

4/23/2021   493107
4/24/2021   493107
4/25/2021   493107
4/26/2021   485117
4/27/2021   485117
4/28/2021   485117
4/29/2021   485117
4/30/2021   485117
5/1/2021    485117
5/2/2021    485117
5/3/2021    485117
5/4/2021    485117
5/5/2021    485117
5/6/2021    485117
5/7/2021    969382

This does not match the output I need.

Set the index of the dataframe to Date, then use asfreq to conform the index to daily frequency, passing backward fill as the fill method:

df.set_index('Date').asfreq('D', method='bfill')

             crude
Date              
2021-04-23  493107
2021-04-24  485117
2021-04-25  485117
2021-04-26  485117
2021-04-27  485117
2021-04-28  485117
2021-04-29  485117
2021-04-30  485117
2021-05-01  484691
2021-05-02  484691
2021-05-03  484691
2021-05-04  484691
2021-05-05  484691
2021-05-06  484691
2021-05-07  484691
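
For a self-contained version of this approach, the sketch below rebuilds the sample data from the question and applies the same asfreq call. It assumes the Date column holds month/day/year strings and that each date appears only once; the variable name daily is introduced here purely for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'Date': ['4/23/2021', '4/26/2021', '4/27/2021', '4/28/2021',
             '4/29/2021', '4/30/2021', '5/7/2021'],
    'crude': [493107, 485117, 485117, 485117, 485117, 485117, 484691],
})

# asfreq needs a DatetimeIndex, so parse the strings first.
df['Date'] = pd.to_datetime(df['Date'], format='%m/%d/%Y')

# Reindex to daily frequency; each new date takes the next known value.
daily = df.set_index('Date').asfreq('D', method='bfill')
print(daily)
```

Because the fill happens during reindexing, no placeholder value is needed for the missing days.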

Try replacing 0 with bfill instead of ffill:

import pandas as pd

df = pd.DataFrame({
    'crude': {'4/23/2021': 493107, '4/26/2021': 485117,
              '4/27/2021': 485117, '4/28/2021': 485117,
              '4/29/2021': 485117, '4/30/2021': 485117,
              '5/7/2021': 484691}
})
df.index = pd.to_datetime(df.index)

df = df.resample('D').sum()

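# Empty days sum to 0 after resample; mask those zeros and backward-fill them.
# (The method= argument of replace is deprecated in recent pandas releases.)
df['crude'] = df['crude'].mask(df['crude'].eq(0)).bfill().astype('int64')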

print(df)

df:

             crude
2021-04-23  493107
2021-04-24  485117
2021-04-25  485117
2021-04-26  485117
2021-04-27  485117
2021-04-28  485117
2021-04-29  485117
2021-04-30  485117
2021-05-01  484691
2021-05-02  484691
2021-05-03  484691
2021-05-04  484691
2021-05-05  484691
2021-05-06  484691
2021-05-07  484691
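
The 969382 in the last row of the question's output is exactly 2 × 484691. This suggests (an assumption, since the full input is not shown) that the real data lists 5/7/2021 twice and that resample('D').sum() added the two rows together. A minimal sketch of a variant that avoids both the doubling and the 0 placeholder: take the last value per day, which leaves empty days as NaN, then backward-fill. The duplicated row below is hypothetical and is there only to illustrate the effect.

```python
import pandas as pd

# Question data plus a hypothetical duplicate row for 5/7/2021.
df = pd.DataFrame({
    'Date': ['4/23/2021', '4/26/2021', '4/27/2021', '4/28/2021',
             '4/29/2021', '4/30/2021', '5/7/2021', '5/7/2021'],
    'crude': [493107, 485117, 485117, 485117, 485117, 485117,
              484691, 484691],
})
df['Date'] = pd.to_datetime(df['Date'], format='%m/%d/%Y')

daily = (
    df.set_index('Date')['crude']
      .resample('D')
      .last()            # one value per day; days with no rows become NaN
      .bfill()           # fill each gap from the next available date
      .astype('int64')   # NaN forced floats; restore integers after filling
)
print(daily)
```

Unlike replacing 0, this leaves any genuine zero readings in the data untouched.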
