In a column, fill values that are not a number with “NaN”

Question

I have a DataFrame with a certain column with values as below:

index     some_column
 0          12345
 1          23549
 2          .....
 3          78516
 4          98713
 5          .....

I want to check each value in the column and, if it is not a number (i.e. if the value is "....."), replace it with np.NaN.

I've tried the function below:

from numbers import Number
def fill_in(values):
    if isinstance(values, Number) == False:
        return np.NaN

Then I call .apply on the column:

df['some_column'].apply(fill_in)

I expected:

index     some_column
 0          12345
 1          23549
 2          NaN
 3          78516
 4          98713
 5          NaN

But instead got:

index     some_column
 0          NaN
 1          NaN
 2          NaN
 3          NaN
 4          NaN
 5          NaN

Can someone please explain what I got wrong?

Answer 1

The function you pass to apply must return a value for every input. Yours returns np.NaN when the value is not a number, but it has no return statement for the case where the value is a number.

A Python function that ends without reaching a return statement returns None, and pandas treats None as a missing value, so those rows come out as NaN as well.
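To see the missing return value in isolation, here is a minimal sketch. The two-element Series named mixed is made up for illustration and is not the asker's data; the function body mirrors the one in the question.

```python
from numbers import Number

import numpy as np
import pandas as pd


def fill_in(values):
    # Same logic as in the question: nothing is returned on the numeric path
    if not isinstance(values, Number):
        return np.nan


print(fill_in(12345))    # None, because the function falls off the end
print(fill_in("....."))  # nan

# Hypothetical sample: one real number and one placeholder string
mixed = pd.Series([12345, "....."], dtype=object)
result = mixed.apply(fill_in)
print(result.isna().all())  # True: every row counts as missing
```

Both branches end up as a missing value, which matches the all-NaN output in the question.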

Adding a return for the numeric case should give you the desired output:

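from numbers import Number
import numpy as np

def fill_in(value):
    # Non-numeric values become NaN; numeric values pass through unchanged
    if not isinstance(value, Number):
        return np.nan
    return value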
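One caveat: a column that contains "....." is probably stored with object dtype, and if the numeric entries are strings such as "12345", the isinstance check is False for every row and the result is still all NaN. In that case, an alternative that may suit better is pd.to_numeric with errors="coerce". The sketch below assumes string data; the DataFrame is a hypothetical reconstruction of the one in the question.

```python
import pandas as pd

# Hypothetical sample data mirroring the question, stored as strings
df = pd.DataFrame(
    {"some_column": ["12345", "23549", ".....", "78516", "98713", "....."]}
)

# errors="coerce" turns anything that cannot be parsed as a number into NaN
df["some_column"] = pd.to_numeric(df["some_column"], errors="coerce")
print(df)
```

Note that the result is assigned back to the column; like apply, pd.to_numeric returns a new Series and does not modify the DataFrame in place.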