Pandas dropna() isn't working (and it's definitely not the usual reason!)

Question

I have this dataframe:

There are many NaNs that were somehow produced when transforming the data:

So I tried to drop them using:

df = df.dropna(how='all')

I still just get this (I know I'm only showing 3 columns, but all the columns are filled with NaNs):

I've also tried assuming they're strings and using:

df = df[~df.isin(['NaN']).any(axis=1)]

This also didn't work. Any other thoughts or ideas?

Answer 1

When you index with a Boolean DataFrame, the logic used is where. That is, where the mask is True it returns the value, and where the mask is False it chooses np.nan by default.

Thus, if you index with df.isna(), by definition you NaN everything. Where df.isna() is True, where passes the original value through (which is NaN), and where df was not null, where fills in NaN.

import pandas as pd
import numpy as np

df = pd.DataFrame({'foo': np.nan, 'bar': np.nan, 'baz': np.nan, 'boo': 1}, index=['A'])
#   foo  bar  baz  boo
#A  NaN  NaN  NaN    1

df.isnull()
#    foo   bar   baz    boo
#A  True  True  True  False

df[df.isnull()]
#   foo  bar  baz  boo
#A  NaN  NaN  NaN  NaN

df.where(df.isnull())
#   foo  bar  baz  boo
#A  NaN  NaN  NaN  NaN
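
To make the fallback visible, here is a minimal sketch that reuses the one-row frame above but passes an explicit other value to where. The placeholder -1 is only an illustrative choice; by default other is NaN, which is why the output above looks like a row full of NaN.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'foo': np.nan, 'bar': np.nan, 'baz': np.nan, 'boo': 1}, index=['A'])

# Cells where the mask is True keep their value (here: NaN).
# Cells where the mask is False are replaced by `other` (-1 is an arbitrary placeholder).
masked = df.where(df.isnull(), other=-1)
print(masked)

# The original frame is untouched; only the displayed result was masked.
print(df)
```

The non-null cell in boo is the only one that gets replaced, which suggests the underlying data still holds its real values.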

So you don't have rows full of NaN; your mask just guarantees every cell becomes NaN. If you want to inspect rows that contain NaN without modifying the values, you can display the rows with at least 1 NaN:

df[df.isnull().any(axis=1)]
#   foo  bar  baz  boo
#A  NaN  NaN  NaN    1

Or, to see the distribution of NaN across the rows, take the value counts of the per-row null count. This shows we have 1 row with 3 null values.

df.isnull().sum(axis=1).value_counts()
#3    1
#dtype: int64
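
As a rough check that dropna itself behaves as expected, the sketch below uses a small hypothetical three-row frame (not the asker's data) with one partly null row, one fully null row, and one complete row. It compares how='all' with how='any' and shows that matching against the string 'NaN' does not find real missing values.

```python
import numpy as np
import pandas as pd

# Hypothetical data: A is partly null, B is entirely null, C is complete.
df = pd.DataFrame(
    {
        'foo': [np.nan, np.nan, 2.0],
        'bar': [np.nan, np.nan, 3.0],
        'boo': [1.0, np.nan, 4.0],
    },
    index=['A', 'B', 'C'],
)

# how='all' drops a row only when every value in it is missing.
dropped_all = df.dropna(how='all')

# how='any' drops a row when at least one value is missing.
dropped_any = df.dropna(how='any')

# Rows that are entirely null, selected with a Boolean Series (row mask).
all_nan_rows = df[df.isnull().all(axis=1)]

# Real missing values are floats, not the string 'NaN', so this matches nothing.
string_match = df.isin(['NaN']).any(axis=1)

print(dropped_all)
print(dropped_any)
print(all_nan_rows)
print(string_match)
```

Indexing with a Boolean Series filters rows, whereas indexing with a Boolean DataFrame masks cells, which is the distinction this answer rests on.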

Answer 1: accepted, 2021-03-31 17:08:17