Python DataFrame: fill NaN values using information from other columns

Question

I tried to solve this problem on my own, but unfortunately I haven't made much progress, and I would really appreciate any help.

My current DataFrame contains 3 columns: 2 complete columns and one column with some missing values, denoted as NaN.

df
Out[18]: 
  x1  x2   x3
0  A   1  2.0
1  B   0  NaN
2  A   0  1.0
3  A   1  2.0
4  A   0  NaN
5  B   1  1.0
6  A   1  1.0
7  B   0  2.0
8  B   0  2.0
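
For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the frame shown above. The values are copied from the printed output; the dtypes (strings for x1, integers for x2, floats for x3) are my assumption based on how the frame prints.

```python
import numpy as np
import pandas as pd

# Rebuild the sample frame from the printed output above.
# Dtypes are assumed from how the values are displayed.
df = pd.DataFrame({
    "x1": ["A", "B", "A", "A", "A", "B", "A", "B", "B"],
    "x2": [1, 0, 0, 1, 0, 1, 1, 0, 0],
    "x3": [2.0, np.nan, 1.0, 2.0, np.nan, 1.0, 1.0, 2.0, 2.0],
})

# Rows 1 and 4 are the ones with a missing x3.
missing_rows = df.index[df["x3"].isna()].tolist()
```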

I would like to fill the missing values in 'x3' with the median of 'x3' within each group defined by 'x1' and 'x2'.

groupby_df = df.groupby(['x1', 'x2'])['x3'].median()

groupby_df
Out[22]: 
x1  x2
A   0     1.0
    1     2.0
B   0     2.0
    1     1.0

So, for instance, the NaN value corresponding to (B, 0) would be replaced by 2, and the one for (A, 0) by 1. Unfortunately, I can't figure out this part. Is there an elegant "DataFrame way" of filling in the NaN values with the median computed by groupby?

Thank you.

Answer 1

Use fillna inside groupby:

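# group_keys=False keeps the original row index, so the result aligns with df on assignment
# (on recent pandas versions the default adds the group keys to the index)
df['x3'] = df.groupby(['x1', 'x2'], group_keys=False)['x3'].apply(lambda x: x.fillna(x.median()))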
df
Out[928]: 
  x1  x2   x3
0  A   1  2.0
1  B   0  2.0
2  A   0  1.0
3  A   1  2.0
4  A   0  1.0
5  B   1  1.0
6  A   1  1.0
7  B   0  2.0
8  B   0  2.0
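
As a possible alternative, the same fill can be sketched with transform, which returns the per-group median already aligned to the original rows, so no lambda is needed. The second half shows one way to reuse a precomputed median Series like the groupby_df from the question, by joining it back on the two key columns. The names group_median, medians, joined and x3_median are mine, not from the original posts, and the frame is rebuilt from the printed output in the question.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question (before any filling).
df = pd.DataFrame({
    "x1": ["A", "B", "A", "A", "A", "B", "A", "B", "B"],
    "x2": [1, 0, 0, 1, 0, 1, 1, 0, 0],
    "x3": [2.0, np.nan, 1.0, 2.0, np.nan, 1.0, 1.0, 2.0, 2.0],
})

# Option 1: transform broadcasts each group's median back onto
# the original index (NaN values are skipped when computing it).
group_median = df.groupby(["x1", "x2"])["x3"].transform("median")
filled = df["x3"].fillna(group_median)

# Option 2: reuse a precomputed median Series (like groupby_df in the
# question) by joining it onto the frame via the key columns.
medians = df.groupby(["x1", "x2"])["x3"].median().rename("x3_median")
joined = df.join(medians, on=["x1", "x2"])
filled_via_join = joined["x3"].fillna(joined["x3_median"])

# Write the result back once satisfied with it.
df["x3"] = filled
```

Both options only touch the rows where x3 is missing; existing values are left as they are. If a whole group were NaN, its median would be NaN too and those rows would stay unfilled.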

Answer 1: accepted, 2017-10-27 17:48:27