
How can I fill NaN values in a DataFrame using the group mean?

I can fill missing values in numeric columns with the following Python code:

df.fillna(df.select_dtypes(include='number').mean().iloc[0], inplace=True)

But this only fills NaN with an overall mean. I have a column of categorical values, and I need each NaN filled with the mean of the category that its row belongs to.

You can use groupby().transform() to place each group's mean on every row of that group, and then pass the result to fillna:

df.fillna(df.groupby('category_column').transform('mean'), inplace=True)
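As a rough illustration of the approach above, here is a small self-contained sketch. The sample data and the column names `value` and `label` are made up for this example, and restricting the operation to numeric columns is an assumption on my part: some pandas versions raise an error when asked for the mean of a non-numeric column, so selecting the numeric columns first seems safer.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one grouping column, one numeric column, one text column.
df = pd.DataFrame({
    "category_column": ["a", "a", "a", "b", "b", "b"],
    "value": [1.0, np.nan, 3.0, 10.0, 20.0, np.nan],
    "label": ["x", "y", None, "z", "w", "v"],
})

# Only numeric columns have a meaningful mean.
numeric_cols = df.select_dtypes(include="number").columns.tolist()

# transform('mean') returns a frame aligned with df's index,
# where every row holds the mean of that row's group.
group_means = df.groupby("category_column")[numeric_cols].transform("mean")

# Fill each NaN with the mean of its own group; non-numeric columns are untouched.
df[numeric_cols] = df[numeric_cols].fillna(group_means)

print(df)
```

Assigning the result back instead of using `inplace=True` is a style choice here, not a requirement. If every value in a group is NaN, that group's mean is also NaN, so those cells stay unfilled.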
