How to fill missing values in a column based on another column
I have a dataframe called shoes:
Brand Comment
Ugg NaN
Prada NaN
Clarks NaN
Ugg NaN
Clark NaN
Prada Made from horse leather
Prada Made from pig leather
Prada NaN
Ugg Made from Australian cow leather
...
and another dataframe, df_mode, which was obtained by taking the mode of the non-null comments for each shoe brand in the shoes dataframe:
Brand Comment
Ugg Made from sheep
Prada Made from pig leather
Clarks Made from Cow leather
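For context, a table like df_mode could be built roughly as follows. This is only a sketch: the sample rows are made up (the full shoes dataframe is elided above), and taking the first value returned by mode() is an assumption about how ties should be broken.

```python
import pandas as pd

# Hypothetical sample rows; the real shoes dataframe is only partly shown.
shoes = pd.DataFrame({
    "Brand": ["Ugg", "Ugg", "Ugg", "Prada", "Prada", "Prada", "Clarks", "Clarks"],
    "Comment": [
        "Made from sheep", "Made from sheep", None,
        "Made from pig leather", "Made from pig leather", "Made from horse leather",
        "Made from Cow leather", None,
    ],
})

# Drop missing comments, then take the most frequent comment per brand.
# Series.mode() can return several values on a tie; iloc[0] keeps the first.
df_mode = (
    shoes.dropna(subset=["Comment"])
         .groupby("Brand", sort=False)["Comment"]
         .agg(lambda s: s.mode().iloc[0])
         .reset_index()
)
print(df_mode)
```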
How can I fill the missing values for each shoe brand in the shoes dataframe with the corresponding mode from the df_mode dataframe?
This is basically what I'm trying to achieve:
Brand Comment
Ugg Made from sheep
Prada Made from pig leather
Clarks Made from Cow leather
Ugg Made from sheep
Clark Made from Cow leather
Prada Made from horse leather
Prada Made from pig leather
Prada Made from pig leather
Ugg Made from Australian cow leather
Use np.where:
import numpy as np
shoes['Comment'] = np.where(shoes['Comment'].isnull(), shoes['Brand'].map(dict(zip(df_mode['Brand'], df_mode['Comment']))), shoes['Comment'])
Use loc and map:
shoes.loc[shoes.Comment.isna(), 'Comment'] = shoes.Brand.map(df_mode.set_index('Brand')['Comment'])
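A self-contained sketch of the loc and map approach, using the rows shown in the question. The variable names lookup and mask are illustrative only. The brand spelled "Clark" has no exact match in df_mode, so map leaves that row as NaN; such spellings would need to be normalised first.

```python
import numpy as np
import pandas as pd

shoes = pd.DataFrame({
    "Brand": ["Ugg", "Prada", "Clarks", "Ugg", "Clark",
              "Prada", "Prada", "Prada", "Ugg"],
    "Comment": [np.nan, np.nan, np.nan, np.nan, np.nan,
                "Made from horse leather", "Made from pig leather", np.nan,
                "Made from Australian cow leather"],
})
df_mode = pd.DataFrame({
    "Brand": ["Ugg", "Prada", "Clarks"],
    "Comment": ["Made from sheep", "Made from pig leather", "Made from Cow leather"],
})

# Build a Brand -> Comment lookup Series from df_mode.
lookup = df_mode.set_index("Brand")["Comment"]

# Only overwrite rows whose Comment is missing; existing comments are kept.
mask = shoes["Comment"].isna()
shoes.loc[mask, "Comment"] = shoes.loc[mask, "Brand"].map(lookup)
print(shoes)
```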
You can first group by the Brand column, then forward-fill and backward-fill the missing values within each group. Note that this copies neighbouring comments of the same brand rather than using df_mode. Here is an implementation:
shoes['Comment'] = shoes.groupby('Brand', sort=False)['Comment'].transform(lambda x: x.ffill().bfill())
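The difference between neighbour filling and mode filling can be seen in a small sketch. The sample rows and the helper brand_mode are hypothetical, and the mode-based variant is an alternative, not the method given in the answer above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows for illustration.
shoes = pd.DataFrame({
    "Brand": ["Prada", "Prada", "Prada", "Prada", "Ugg", "Ugg"],
    "Comment": [np.nan, "Made from horse leather", "Made from pig leather",
                "Made from pig leather", "Made from sheep", np.nan],
})

def brand_mode(s):
    """Most frequent non-null value of s, or NaN if s has none."""
    modes = s.mode()
    return modes.iloc[0] if not modes.empty else np.nan

grouped = shoes.groupby("Brand", sort=False)["Comment"]

# Neighbour fill: takes the nearest non-null comment within the same brand.
neighbour_fill = grouped.transform(lambda s: s.ffill().bfill())

# Mode fill: computes the per-brand mode and uses it only where Comment is missing.
mode_fill = shoes["Comment"].fillna(grouped.transform(brand_mode))

print(pd.DataFrame({"neighbour": neighbour_fill, "mode": mode_fill}))
```

In this sample the first Prada row is filled with the horse-leather comment by the neighbour fill, while the mode fill gives it the pig-leather comment.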