How to deal with NaN values coming from pct_change when another column's value differs from the previous row

Question

I have a dataframe (df) like so:

Year |  Name  |  Count
2017   John       1
2018   John       2
2019   John       3
2017   Fred       1
2018   Fred       2
2019   Fred       3

Applying the code below gives me NaN values in the first row of each group. How can I replace those NaN values with the average percentage change for that group? For example, John's changes are 1.0 and 0.5, so his NaN should be replaced with 0.75 = (1.0 + 0.5) / 2.

# Group consecutive runs of the same Name, then take the percentage change within each run
df['pct_chg'] = df.groupby([df.Name.ne(df.Name.shift()).cumsum(), 'Name'])['Count'].pct_change()
print(df)

   Year  Name  Count  pct_chg
0  2017  John      1      NaN
1  2018  John      2      1.0
2  2019  John      3      0.5
3  2017  Fred      1      NaN
4  2018  Fred      2      1.0
5  2019  Fred      3      0.5

Answer 1

Create a new column containing the average value of each group, as in the example below:

import pandas as pd
import numpy as np
df = pd.DataFrame({
    'group': [1,1,1,2,2,2],
    'value': [None, 0.5, 1, None, 0.75, 0.25]
})
# Per-group mean of 'value' (NaN is skipped), broadcast back to every row of the group
df['avg_value'] = df.groupby('group')['value'].transform('mean')

Then use np.where to fill the value column conditionally: if value is null, use avg_value; otherwise keep the existing value.

df['value'] = np.where(
    df['value'].isna(),
    df['avg_value'],
    df['value']
)
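
The same idea can be applied directly to the dataframe from the question. This is only a minimal sketch: it assumes the data looks like the sample shown there, and the names run and group_mean are introduced here for illustration. It uses fillna in place of np.where, which has the same effect for this case.

```python
import pandas as pd

df = pd.DataFrame({
    'Year': [2017, 2018, 2019, 2017, 2018, 2019],
    'Name': ['John', 'John', 'John', 'Fred', 'Fred', 'Fred'],
    'Count': [1, 2, 3, 1, 2, 3],
})

# Label each consecutive run of the same Name (same grouping idea as in the question).
run = df['Name'].ne(df['Name'].shift()).cumsum()

# Percentage change within each run; the first row of every run is NaN.
df['pct_chg'] = df.groupby(run)['Count'].pct_change()

# Mean pct_chg per run (NaN is skipped by default), aligned to the original rows.
group_mean = df.groupby(run)['pct_chg'].transform('mean')

# Replace only the NaN entries with the mean of their own run.
df['pct_chg'] = df['pct_chg'].fillna(group_mean)
print(df)
```

For the sample data, the two NaN rows are filled with 0.75, the mean of 1.0 and 0.5.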
