Pandas - Replace NaNs in a column with the mean of specific group

I am working with data like the following. The dataframe is sorted by the date:

category  value  Date
0         1      24/5/2019     
1         NaN    24/5/2019         
1         1      26/5/2019       
2         2      1/6/2019      
1         2      23/7/2019       
2         NaN    18/8/2019         
2         3      20/8/2019       
7         3      1/9/2019 
1         NaN    12/9/2019       
2         NaN    13/9/2019

I would like to replace the "NaN" values with the previous mean for that specific category.

What is the best way to do this in pandas?

Some approaches I considered:

1) This little riff:

   df['mean'] = df.groupby('category')['time'].apply(lambda x: x.shift().expanding().mean())

source

This gives me the correct means, but in a separate column; it does not replace the NaNs.

2) This riff replaces the NaNs with the average of the columns:

df = df.groupby(df.columns, axis = 1).transform(lambda x: x.fillna(x.mean()))

Source 2

Neither of these gives exactly what I want. If someone could guide me on this, it would be much appreciated!

You can replace the NaNs in value with a new Series built from shift + expanding + mean. The first value of category 1 is not replaced because no previous non-NaN values exist for it:

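import pandas as pd
# Dates are day-first (e.g. 24/5/2019), so parse them accordingly
df['Date'] = pd.to_datetime(df['Date'], dayfirst=True)
# Mean of all earlier values in the same category (NaNs are skipped);
# group_keys=False keeps the original row index so fillna can align on it
s = df.groupby('category', group_keys=False)['value'].apply(lambda x: x.shift().expanding().mean())
df['value'] = df['value'].fillna(s)
print(df)
   category  value       Date
0         0    1.0 2019-05-24
1         1    NaN 2019-05-24
2         1    1.0 2019-05-26
3         2    2.0 2019-06-01
4         1    2.0 2019-07-23
5         2    2.0 2019-08-18
6         2    3.0 2019-08-20
7         7    3.0 2019-09-01
8         1    1.5 2019-09-12
9         2    2.5 2019-09-13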
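
To see why the first NaN in category 1 stays unfilled, it may help to look at one group in isolation. The sketch below takes the category 1 values from the question and walks through shift and expanding mean; the variable names are illustrative only and not part of the answer above.

```python
import numpy as np
import pandas as pd

# Values of category 1 in date order, keeping the original row labels
group = pd.Series([np.nan, 1.0, 2.0, np.nan], index=[1, 2, 4, 8])

# shift() moves every value down one row, so each row only "sees" earlier rows
shifted = group.shift()

# expanding().mean() averages everything seen so far, skipping NaNs
previous_mean = shifted.expanding().mean()

print(previous_mean)
```

Row 1 has no earlier values, so its previous mean is NaN and fillna leaves it unchanged. Row 8 gets (1 + 2) / 2 = 1.5.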

You can use pandas.Series.fillna to replace NaN values:

df['value']=df['value'].fillna(df.groupby('category')['value'].transform(lambda x: x.shift().expanding().mean()))
print(df)

   category  value       Date
0         0    1.0  24/5/2019
1         1    NaN  24/5/2019
2         1    1.0  26/5/2019
3         2    2.0   1/6/2019
4         1    2.0  23/7/2019
5         2    2.0  18/8/2019
6         2    3.0  20/8/2019
7         7    3.0   1/9/2019
8         1    1.5  12/9/2019
9         2    2.5  13/9/2019
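
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. It rebuilds the sample frame from the question by hand (the Date column is left out, since it only determines the row order) and assumes the rows are already sorted by date.

```python
import numpy as np
import pandas as pd

# Sample data from the question, already in date order
df = pd.DataFrame({
    'category': [0, 1, 1, 2, 1, 2, 2, 7, 1, 2],
    'value': [1, np.nan, 1, 2, 2, np.nan, 3, 3, np.nan, np.nan],
})

# Per-category mean of all earlier non-NaN values, aligned to df's index
prev_mean = df.groupby('category')['value'].transform(
    lambda x: x.shift().expanding().mean()
)

# Only the NaN positions are taken from prev_mean
df['value'] = df['value'].fillna(prev_mean)
print(df)
```

Because prev_mean is computed from the original column before any filling, a filled value does not feed into later means. Row 1 stays NaN because category 1 has no earlier observation; what to put there is a separate decision that the question does not specify.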
