Conditional NaN fill does not change the column or sets everything to None

Question

I have a df with a column, Critic_Score, that has NaN values. I am trying to replace them with the average of the critic scores from the same platform. This question has been asked on Stack Overflow several times, and I tried 4 suggestions, none of which gave me the desired output. Please tell me how to fix this.

This is a subset of the df:

x[['Platform','Critic_Score']].head()

  Platform  Critic_Score
0      wii          76.0
1      nes           NaN
2      wii          82.0
3      wii          80.0
4       gb           NaN

More information on the original df:

x.head().to_dict('list')
{'Name': ['wii sports',
  'super mario bros.',
  'mario kart wii',
  'wii sports resort',
  'pokemon red/pokemon blue'],
 'Platform': ['wii', 'nes', 'wii', 'wii', 'gb'],
 'Year_of_Release': [2006.0, 1985.0, 2008.0, 2009.0, 1996.0],
 'Genre': ['sports', 'platform', 'racing', 'sports', 'role-playing'],
 'NA_sales': [41.36, 29.08, 15.68, 15.61, 11.27],
 'EU_sales': [28.96, 3.58, 12.76, 10.93, 8.89],
 'JP_sales': [3.77, 6.81, 3.79, 3.28, 10.22],
 'Other_sales': [8.45, 0.77, 3.29, 2.95, 1.0],
 'Critic_Score': [76.0, nan, 82.0, 80.0, nan],
 'User_Score': ['8', nan, '8.3', '8', nan],
 'Rating': ['E', nan, 'E', 'E', nan]}

These are the statements I tried, each followed by its output:

1.

x['Critic_Score'] = x['Critic_Score'].fillna(x.groupby('Platform')['Critic_Score'].transform('mean'), inplace = True)

0    None
1    None
2    None
3    None
4    None
Name: Critic_Score, dtype: object

x.loc[x.Critic_Score.isnull(), 'Critic_Score'] = x.groupby('Platform').Critic_Score.transform('mean')
#no change in column
0    76.0
1     NaN
2    82.0
3    80.0
4     NaN

x['Critic_Score'] = x.groupby('Platform')['Critic_Score']\
    .transform(lambda y: y.fillna(y.mean()))
#no change in column
0    76.0
1     NaN
2    82.0
3    80.0
4     NaN
Name: Critic_Score, dtype: float64

x['Critic_Score']=x.groupby('Platform')['Critic_Score'].apply(lambda y:y.fillna(y.mean()))

x['Critic_Score'].head()


Out[73]:
0    76.0
1     NaN
2    82.0
3    80.0
4     NaN
Name: Critic_Score, dtype: float64

Answer 1

x.update(
    x.groupby('Platform').Critic_Score.transform('mean'),
    overwrite=False)

First you create a new Series with the same number of rows as the df, holding the platform average on every row.
Then use that to update the original. With overwrite=False, only the NaN values in the original are filled.
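To make the two steps concrete, here is a minimal sketch on made-up data (the five rows below are hypothetical, chosen so that every platform has at least one known score; they are not from the original df):

```python
import numpy as np
import pandas as pd

# Hypothetical sample: each platform has at least one non-NaN score.
x = pd.DataFrame({
    "Platform": ["wii", "wii", "wii", "nes", "nes"],
    "Critic_Score": [76.0, np.nan, 80.0, 90.0, np.nan],
})

# Step 1: a Series aligned to x's index with the platform mean on every row.
platform_mean = x.groupby("Platform")["Critic_Score"].transform("mean")

# Step 2: update x in place. The Series is matched to the column by its name,
# and overwrite=False means only NaN cells are filled; known scores are kept.
x.update(platform_mean, overwrite=False)

print(x)
```

Note that update works in place and returns None, so its result should not be assigned back to x.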

Bear in mind that your sample has only one row of nes and one of gb, both with a NaN score, so there is nothing to average for those platforms and their rows stay NaN.
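This can be checked on the five sample rows from the question. The sketch below rebuilds only the two relevant columns; the variable names platform_mean, filled and still_missing are illustrative. It also assigns the result of fillna without inplace=True, since fillna(..., inplace=True) returns None, which is likely why the first attempt produced a column of None.

```python
import numpy as np
import pandas as pd

# The five sample rows from the question (Platform and Critic_Score only).
x = pd.DataFrame({
    "Platform": ["wii", "nes", "wii", "wii", "gb"],
    "Critic_Score": [76.0, np.nan, 82.0, 80.0, np.nan],
})

# nes and gb have no known scores, so their group mean is itself NaN.
platform_mean = x.groupby("Platform")["Critic_Score"].transform("mean")

# No inplace=True here: fillna returns the filled Series, which we keep.
filled = x["Critic_Score"].fillna(platform_mean)

# Count the rows that are still NaN after the fill.
still_missing = filled.isna().sum()
print(platform_mean)
print(still_missing)
```

On the full df, platforms that have at least one scored game would be filled. Platforms with no scores at all would need a separate decision, which the answer above does not cover.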

Solution 1
2 Accepted 2020-08-29 22:56:59

有条件的 NaN 填充不更改列或全部为 None

问题描述

1 个解决方案

解决方案1 2 已采纳 2020-08-29 22:56:59

解决方案1
2 已采纳 2020-08-29 22:56:59