How to convert 'NaN' strings in a pandas Series to null values for dropna?

Question

I tried a couple of methods to drop rows containing NaN in a particular Series of my DataFrame, only to realize that every NaN entry is the string 'NaN', not a null value.

In my specific example, each row represents a country and so I want to remove all countries that do not have a GDP value in the 'GDP per Capita' column from the DataFrame.

Some things I tried (that failed):

df_noGDP = df
df_noGDP.dropna(axis=0, subset=['GDP per Capita'])

and

df_noGDP = df.loc[df['GDP per Capita'] != np.nan]

When I inspect df_noGDP, I see that none of the NaN rows have been removed. I figure I'm either making a silly syntax error somewhere or I need to convert my data types.

Answer 1

Do:

df_noGDP=df_noGDP.replace('NaN',np.nan)

Or:

df_noGDP.replace('NaN', np.nan, inplace=True)

After that, dropna should work as expected, because the column then holds real null values instead of strings.
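As a minimal sketch of the idea above, the snippet below uses a small made-up DataFrame (the country names and GDP values are hypothetical, not from the question) to show that dropna ignores the string 'NaN' until it has been replaced with np.nan.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: the missing GDP is the string 'NaN', not a null.
df = pd.DataFrame({
    "Country": ["A", "B", "C"],
    "GDP per Capita": ["100.5", "NaN", "250.0"],
})

# dropna finds no real nulls yet, so every row survives.
before = df.dropna(subset=["GDP per Capita"])

# Convert the 'NaN' strings to real nulls first, then drop those rows.
cleaned = df.replace("NaN", np.nan).dropna(subset=["GDP per Capita"])

print(len(before), cleaned["Country"].tolist())
```

Note that the remaining values in this sketch are still strings; converting them to numbers is a separate step.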

Answer 2

First convert your strings to NaN values:

df = df.replace('NaN', np.nan)

Then assign back or specify your method to be in-place:

df = df.dropna(subset=['GDP per Capita'])           # not in place version
df.dropna(subset=['GDP per Capita'], inplace=True)  # in place version
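To illustrate why the assignment matters, here is a small sketch on a hypothetical two-row frame (the data is invented for the example). Calling dropna without assigning the result or passing inplace=True leaves the original DataFrame untouched, which is likely part of what went wrong in the first attempt in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical frame that already holds one real null.
df = pd.DataFrame({"Country": ["A", "B"], "GDP per Capita": [100.5, np.nan]})

# The result is discarded here, so df itself does not change.
df.dropna(subset=["GDP per Capita"])
rows_unassigned = len(df)

# Assigning the result back keeps the filtered frame.
df = df.dropna(subset=["GDP per Capita"])
rows_assigned = len(df)

print(rows_unassigned, rows_assigned)
```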

Alternatively, use loc with notnull. A comparison such as != np.nan cannot filter anything, since NaN != NaN by design:

df = df.loc[df['GDP per Capita'].notnull()]
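The difference between the two masks can be checked on a small hypothetical Series (values invented for the example). This is only a sketch of the behaviour described above: the != comparison is True for every row, while notnull is False only where a real null sits.

```python
import numpy as np
import pandas as pd

# Hypothetical column with one real null in the middle.
s = pd.Series([100.5, np.nan, 250.0], name="GDP per Capita")

# NaN never compares equal to anything, itself included,
# so this mask is True for every row and filters nothing.
mask_ne = s != np.nan

# notnull() is False only for the actual null.
mask_notnull = s.notnull()

print(mask_ne.tolist(), mask_notnull.tolist())
```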

2 answers

solution1
1 2018-12-11 03:28:26

solution2
1 ACCEPTED 2018-12-11 03:30:30