
Using pandas, fill all empty values with the previous month's last value within a groupby

I have a dataframe as shown below

Using pandas, I want to fill the empty values in the Price column with the last value of the previous month, within each group.

For each ID, Sector and Usage combination, the Price column should be filled up to the last date that exists.

ID    Sector    Usage     Price   Date 
1     A         R         20      29/08/2022
1     A         R         30      30/08/2022
1     A         R         40      31/08/2022
1     A         R                 01/09/2022
1     A         R                 02/09/2022
.     .         .          .          . 
.     .         .          .          . 
1     A         R                 30/09/2022
.     .         .          .          . 
.     .         .          .          .
1     A         R                 31/10/2022
.     .         .          .          . 
.     .         .          .          .
1     A         R                 30/11/2022
2     B         C         200     31/08/2022
3     B         R         60      31/08/2022

Expected output:

ID    Sector    Usage     Price    Date
1     A         R         20      29/08/2022
1     A         R         30      30/08/2022
1     A         R         40      31/08/2022
1     A         R                 01/09/2022
1     A         R                 02/09/2022
.     .         .          .          . 
.     .         .          .          . 
1     A         R          40       30/09/2022
.     .         .          .          . 
.     .         .          .          .
1     A         R          40       31/10/2022
.     .         .          .          . 
.     .         .          .          .
1     A         R          40      30/11/2022
2     B         C          200      31/08/2022
2     B         C          200      01/09/2022
.     .         .          .          . 
.     .         .          .          . 
2     B         C          200      31/10/2022
.     .         .          .          . 
.     .         .          .          . 
2     B         C          200      31/12/2022
3     B         R          60       31/08/2022

I have tried the code below, but it does not work:

m = df['Price'] == ''
s = df.assign(Price=df['Price'].mask(m)).groupby(['Sector','Usage'])['Price'].ffill()
df['Price'] = np.where(m, s, df['Price']).astype(int)

or

df.replace({'Price': {0:np.NaN}}).ffill()

Assuming your empty values are empty strings:

import pandas as pd
import numpy as np

df = pd.DataFrame.from_dict({"fills": [100, 200, "", 40, "", 5]})

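# Turn empty strings into NaN, make the column numeric, then forward-fill.
# (regex=True is not needed for an exact match on "", and
# fillna(method="ffill") is deprecated in recent pandas in favour of ffill().)
df["fills"].replace("", np.nan).astype(float).ffill()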

Output:

  fills
0   100
1   200
2      
3    40
4      
5     5

   fills
0    100.0
1    200.0
2    200.0
3     40.0
4     40.0
5      5.0
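
If the blanks might be whitespace rather than strictly empty strings, one possible alternative is to coerce the column to numeric before filling. This is only a minimal sketch, assuming every cell that cannot be parsed as a number should count as missing; the sample values are made up for illustration.

```python
import pandas as pd

# Hypothetical sample: blanks appear as "" and as a whitespace-only string.
df = pd.DataFrame({"fills": [100, 200, "", 40, "  ", 5]})

# errors="coerce" turns anything that cannot be parsed as a number into NaN,
# so both kinds of blank become missing values before the forward fill.
filled = pd.to_numeric(df["fills"], errors="coerce").ffill()
print(filled.tolist())
```

Note that coercion also silently turns genuinely malformed values into NaN, so it is worth checking the column for unexpected text first.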

And with groupby / transform:

df = pd.DataFrame.from_dict({"fills": [100, 200, "", 40, "", 5], "grps": ["A", "B", "C", "A", "A", "B"]})

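# Convert blanks to NaN first, then forward-fill inside each group,
# so that a value from one group is never carried into another.
df["fills"] = df["fills"].replace("", np.nan).astype(float)
df["fills"] = df.groupby("grps")["fills"].transform(lambda s: s.ffill())

Output:

  fills grps
0   100    A
1   200    B
2          C
3    40    A
4          A
5     5    B

   fills grps
0  100.0    A
1  200.0    B
2    NaN    C
3   40.0    A
4   40.0    A
5    5.0    B

Group C has no earlier value of its own, so its row stays NaN.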
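
Applied to the columns from the question, the same idea might look like the sketch below. It assumes the blank prices are empty strings and the dates are day-first strings; the sample rows are a made-up subset shaped like the question's table, not the asker's real data.

```python
import pandas as pd

# Made-up sample shaped like the question's table; blank prices are "".
df = pd.DataFrame({
    "ID": [1, 1, 1, 1, 2, 2],
    "Sector": ["A", "A", "A", "A", "B", "B"],
    "Usage": ["R", "R", "R", "R", "C", "C"],
    "Price": [30, 40, "", "", 200, ""],
    "Date": ["30/08/2022", "31/08/2022", "01/09/2022", "30/09/2022",
             "31/08/2022", "01/09/2022"],
})

# Parse day-first dates and turn blank prices into NaN.
df["Date"] = pd.to_datetime(df["Date"], format="%d/%m/%Y")
df["Price"] = pd.to_numeric(df["Price"], errors="coerce")

# Sort by date inside each group so "last known value" is well defined,
# then forward-fill Price separately for every ID/Sector/Usage combination.
keys = ["ID", "Sector", "Usage"]
df = df.sort_values(keys + ["Date"]).reset_index(drop=True)
df["Price"] = df.groupby(keys)["Price"].ffill()
print(df)
```

This fills every blank row in a group with the most recent earlier price. It does not add rows for dates that are missing from the data, and it does not reproduce the blank early-September rows shown in the question's expected output; both would need extra steps that depend on rules the question does not spell out.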
