I have a dataframe (df) like so:
Year  Name  Count
2017  John  1
2018  John  2
2019  John  3
2017  Fred  1
2018  Fred  2
2019  Fred  3
Applying the code below gives me NaN values in the first row of each group. How can I replace each NaN with the average percentage change for its own group? For example, John's non-null values are 1.0 and 0.5, so John's NaN should be replaced with 0.75 = (1.0 + 0.5) / 2.
df['pct_chg']=df.groupby([df.Name.ne(df.Name.shift()).cumsum(),'Name'])['Count'].\
apply(lambda x: x.pct_change())
print(df)
Year Name Count pct_chg
0 2017 John 1 NaN
1 2018 John 2 1.0
2 2019 John 3 0.5
3 2017 Fred 1 NaN
4 2018 Fred 2 1.0
5 2019 Fred 3 0.5
First, create a new column containing the average value of each group, as in the example below:
import pandas as pd
import numpy as np
df = pd.DataFrame({
'group': [1,1,1,2,2,2],
'value': [None, 0.5, 1, None, 0.75, 0.25]
})
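# Select the 'value' column explicitly so transform returns a Series aligned to df;
# mean skips NaN, so only the non-null values of each group are averaged.
df['avg_value'] = df.groupby('group')['value'].transform('mean')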
Then apply np.where to fill the value column by condition: if value is null, use avg_value; otherwise keep the existing value.
df['value'] = np.where(
df['value'].isna(),
df['avg_value'],
df['value']
)
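The same idea can be applied directly to the dataframe from the question. The following is a minimal sketch, not necessarily the only way to do it. It assumes that grouping by Name alone is enough (the question groups by consecutive runs of Name, which gives the same groups for this sample data), and it uses fillna with a group-wise mean instead of an intermediate avg_value column:

```python
import pandas as pd

df = pd.DataFrame({
    'Year': [2017, 2018, 2019, 2017, 2018, 2019],
    'Name': ['John', 'John', 'John', 'Fred', 'Fred', 'Fred'],
    'Count': [1, 2, 3, 1, 2, 3],
})

# Percentage change within each Name group; the first row of each group is NaN.
# Assumption: grouping by Name alone matches the grouping in the question.
df['pct_chg'] = df.groupby('Name')['Count'].pct_change()

# Mean of the non-null pct_chg values per group, aligned to the original rows.
group_mean = df.groupby('Name')['pct_chg'].transform('mean')

# Fill each NaN with the mean of its own group; other rows are left unchanged.
df['pct_chg'] = df['pct_chg'].fillna(group_mean)

print(df)
```

With the sample data, the NaN rows for John and Fred are each filled with the mean of 1.0 and 0.5 for that group.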