Why doesn't dropna seem to work on this column?

Question

I am trying to drop all NA values in one column, Filmname, but the rows are not dropped. Why? (screenshot of my result)

Here is my code:

import pandas as pd
df = pd.read_csv(...)

df.dropna(subset=['Filmname'], inplace=True)
df.head()

Answer 1

By default, pandas.read_csv does not treat the string "na" as NaN.

You can add this as a NaN string manually via the na_values argument:

df = pd.read_csv('file.csv', na_values=['na'])

As per the docs:

na_values : scalar, str, list-like, or dict, default None

Additional strings to recognize as NA/NaN. If dict passed, specific per-column NA values. By default the following values are interpreted as NaN: '', '#N/A', '#N/A N/A', '#NA', '-1.#IND', '-1.#QNAN', '-NaN', '-nan', '1.#IND', '1.#QNAN', 'N/A', 'NA', 'NULL', 'NaN', 'n/a', 'nan', 'null'.

Answer 2

Judging by the screenshot, the values in that column are not real NaN values but the literal string "na", as parsed from the file.

To filter out the rows with this value in this column, you can simply index the df with a boolean condition instead of using dropna:

df = pd.read_csv(...)
filtered_df = df[df['Filmname'] != 'na']

The condition inside the brackets can be any boolean expression; see this guide for a start.
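As a rough illustration of filtering after loading, the sketch below applies the boolean condition to a small made-up frame, and also shows an alternative that converts "na" to NaN first so that dropna works as the question expected. The sample rows and the Year column are assumptions, not the asker's data.

```python
import numpy as np
import pandas as pd

# Hypothetical frame where "na" was read in as a plain string.
df = pd.DataFrame({"Filmname": ["Alien", "na", "Heat"],
                   "Year": [1979, 1982, 1995]})

# Option 1: keep only the rows where the condition is True.
filtered_df = df[df["Filmname"] != "na"]

# Option 2: turn the "na" strings into real NaN, then use dropna.
cleaned_df = (
    df.assign(Filmname=df["Filmname"].replace("na", np.nan))
      .dropna(subset=["Filmname"])
)

print(filtered_df)
print(cleaned_df)
```

Both options leave the original df untouched and return a new frame with the same two rows.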
