How can I calculate the percentage of empty values in a pandas dataframe?

Question

I have a dataframe df that I know contains empty values, i.e. '' (blank strings). For each column, I want to calculate the percentage of such observations and then replace them with NaN.

To get the percentage I've tried:

for col in df:
   empty = round((df[df[col]] == '').sum()/df.shape[0]*100, 1)

I have similar code that calculates the percentage of zeros, and it does work:

zeros = round((df[col] == 0).sum()/df.shape[0]*100, 1)

Answer 1

I think you need Series.isna to test for missing values (note that it does not count empty strings):

nans = round(df[col].isna().sum()/df.shape[0]*100, 1)

This can be simplified with mean:

nans = round(df[col].isna().mean()*100, 1)
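As a quick check of the two forms above, here is a minimal sketch on a made-up frame (the column name `a` and its values are assumptions for illustration only). It suggests that the mean of the boolean mask gives the same result as the sum divided by the row count, and that isna leaves empty strings uncounted, so those need their own comparison.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one real NaN, one empty string, two ordinary values.
df = pd.DataFrame({"a": ["x", "", np.nan, "y"]})
col = "a"

# Both expressions give the share of missing values as a percentage.
via_sum = round(df[col].isna().sum() / df.shape[0] * 100, 1)
via_mean = round(df[col].isna().mean() * 100, 1)

# Empty strings are not NaN, so they are counted with a direct comparison.
empty = round((df[col] == "").mean() * 100, 1)

print(via_sum, via_mean, empty)
```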

To replace empty or whitespace-only strings with NaN, use:

import numpy as np
df = df.replace(r'^\s*$', np.nan, regex=True)

nans = round(df[col].isna().mean()*100, 1)
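To see what the regex replacement does, here is a small sketch on invented data (the frame, its columns `a` and `b`, and the name `cleaned` are assumptions, not part of the question). The pattern `^\s*$` should match strings that are empty or contain only whitespace, while a string with inner spaces is left alone.

```python
import numpy as np
import pandas as pd

# Hypothetical data: an empty string, a whitespace-only string,
# and a string that merely contains a space.
df = pd.DataFrame({"a": ["x", "", "   ", "y z"], "b": [1, 2, 3, 4]})

# Empty and whitespace-only strings become NaN; other values are kept.
cleaned = df.replace(r"^\s*$", np.nan, regex=True)

# Percentage of missing values in column "a" after the replacement.
nans = round(cleaned["a"].isna().mean() * 100, 1)

print(cleaned)
print(nans)
```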

To test all columns at once:

nans = df.isna().mean().mul(100).round(1)
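The question originally asked for the share of empty strings rather than NaN. The same all-columns pattern can be applied to that case before any replacement is done. This is only a sketch under assumptions: the frame and the name `empty_pct` are made up, and DataFrame.eq is used here as the method form of `==`.

```python
import pandas as pd

# Hypothetical data with empty strings in two columns and a numeric column.
df = pd.DataFrame({
    "a": ["x", "", "", "y"],
    "b": ["", "p", "q", "r"],
    "c": [0, 1, 2, 3],
})

# Share of exactly-empty strings per column, in percent.
# Numeric cells never equal '', so column "c" contributes 0.
empty_pct = df.eq("").mean().mul(100).round(1)

print(empty_pct)
```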

Answer 2

A complete solution would be to replace the empty strings first and then compute the percentages:

df = df[df != ''] # Replace all empty strings with NaN.

for col in df:
    empty_avg = round(df[col].isna().mean()*100, 1) # Percentage of missing values in this column.
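One difference between the boolean-mask form `df[df != '']` and a regex replacement may be worth checking on your own data: the mask only catches strings that are exactly empty. A minimal sketch, assuming a made-up one-column frame (the names `masked` and `regexed` are illustrative):

```python
import numpy as np
import pandas as pd

# Hypothetical data: one exactly-empty string and one whitespace-only string.
df = pd.DataFrame({"a": ["x", "", "  "]})

# Boolean mask: only cells equal to '' become NaN.
masked = df[df != ""]

# Regex replacement: empty and whitespace-only cells both become NaN.
regexed = df.replace(r"^\s*$", np.nan, regex=True)

masked_count = int(masked["a"].isna().sum())
regexed_count = int(regexed["a"].isna().sum())

print(masked_count, regexed_count)
```

If the blanks in the data can contain spaces, the regex form is probably the safer choice.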

2 answers

solution1
2 ACCEPTED 2021-04-07 09:56:08

solution2
1 2021-04-07 10:11:10
