Individually replace NaN in pandas.dataframe

Question

I have a 900 x 7 dataframe in which 3 fields contain some NaN values.

Instead of simply replacing these values with a feature average, I have written a function that uses an algorithm to estimate the likely value of each NaN based on the other values in that row.

How can I iterate over each NaN to change its value using my custom function?

My function takes the row ID, the other feature names, and the feature containing the NaN as arguments.

E.g.:

custom_fillnan(id=0, ins=["val0", "val1", "val2"], out="valn")

Example dataframe:

ID    val0    val1    val2    ...    valn
0      1        2       3     ...    NaN
1      1      NaN       3     ...     4
2      0        0     NaN     ...     1
...

Answer 1

IIUC you could use apply with axis=1 and fillna with your custom function:

In [80]: df
Out[80]: 
   ID  val0  val1  val2  valn
0   0     1     2     3   NaN
1   1     1   NaN     3     4
2   2     0     0   NaN     1


In [83]: df.apply(lambda x: x.fillna(x.iloc[1:].mean()), axis=1)
Out[83]: 
   ID  val0      val1      val2  valn
0   0     1  2.000000  3.000000     2
1   1     1  2.666667  3.000000     4
2   2     0  0.000000  0.333333     1

Instead of the mean you could call your own function. x.iloc[1:] skips the ID column because, as I understand it, your function should only use the val columns.
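
To illustrate swapping in your own function, here is a minimal sketch. estimate_missing is a hypothetical stand-in for your algorithm (it just returns the row median), and the dataframe is rebuilt from the example above:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "ID": [0, 1, 2],
    "val0": [1, 1, 0],
    "val1": [2, np.nan, 0],
    "val2": [3, 3, np.nan],
    "valn": [np.nan, 4, 1],
})

def estimate_missing(values):
    # Hypothetical placeholder for the custom algorithm:
    # returns one fill value computed from the row's val columns.
    return values.median()

def fill_row(row):
    # row.iloc[1:] skips the ID column; NaN entries are ignored by median()
    return row.fillna(estimate_missing(row.iloc[1:]))

filled = df.apply(fill_row, axis=1)
print(filled)
```

Note that this fills every NaN in a row with the same value, so it suits rows with a single missing field.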

EDIT

If you want the column names of the missing values, you could apply a function like this one and use the result in your processing:

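def func(x):
    x = x.astype(object)      # copy that can hold strings next to numbers
    mask = x.isnull()
    x[mask] = x.index[mask]   # replace each NaN with its column name
    return x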

In [209]: df.apply(func, axis=1)
Out[209]: 
   ID  val0  val1  val2  valn
0   0     1     2     3  valn
1   1     1  val1     3     4
2   2     0     0  val2     1
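
If you would rather call your function once per NaN, with the signature from the question, something along these lines might work. This is only a sketch: the body of custom_fillnan below is a placeholder (mean of the other features) that reads the module-level df, and I am assuming ID is the index:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "ID": [0, 1, 2],
    "val0": [1, 1, 0],
    "val1": [2, np.nan, 0],
    "val2": [3, 3, np.nan],
    "valn": [np.nan, 4, 1],
}).set_index("ID")

def custom_fillnan(id, ins, out):
    # Placeholder for the real algorithm: mean of the other features in row `id`.
    # `out` is unused here, the real function presumably needs it.
    return df.loc[id, ins].mean()

# Positions of every NaN cell, collected before anything is modified
rows, cols = np.where(df.isna().to_numpy())

for r, c in zip(rows, cols):
    row_id = df.index[r]
    out = df.columns[c]
    ins = [col for col in df.columns if col != out]
    df.loc[row_id, out] = custom_fillnan(id=row_id, ins=ins, out=out)

print(df)
```

Values are written back as the loop goes, so if a row has several NaN the later estimates will see the earlier ones. Work from a copy of df if you do not want that.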
