I can fill in missing numerical values with the following Python code:
df.fillna(df.select_dtypes(include='number').mean(), inplace=True)
But this only fills NaN with the overall mean. I have a column of categorical values, and I need to fill each NaN with the mean of its category in that column.
You can use groupby().transform() to place each group's mean on every row, and then pass the result to fillna:
df.fillna(df.groupby('category_column').transform('mean'), inplace=True)
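To see the one-liner in action, here is a minimal sketch on a made-up frame. The data and the `value` column name are assumptions for illustration; only `category_column` comes from the answer above. It also restricts the operation to numeric columns, which is an extra precaution not in the original answer, since recent pandas versions may raise an error when asked for the mean of a non-numeric column.

```python
import numpy as np
import pandas as pd

# Hypothetical example data: two categories, one NaN in each.
df = pd.DataFrame({
    "category_column": ["a", "a", "a", "b", "b", "b"],
    "value": [1.0, np.nan, 3.0, 10.0, 20.0, np.nan],
})

# Only numeric columns can be averaged, so select them explicitly.
numeric_cols = df.select_dtypes(include="number").columns.tolist()

# transform('mean') returns a frame with the same index as df,
# where each row holds the mean of that row's group (NaN is skipped).
group_means = df.groupby("category_column")[numeric_cols].transform("mean")

# fillna aligns on index and column labels, so each NaN gets its own group's mean.
df[numeric_cols] = df[numeric_cols].fillna(group_means)

print(df)
```

In this sketch the NaN in category `a` becomes 2.0 and the NaN in category `b` becomes 15.0. If every row in a category is NaN, the group mean is itself NaN and those cells stay unfilled.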