Pandas dropna() not working (it definitely isn't the common reasons why!)
I have this dataframe:
There are many NaNs that were somehow produced when transforming data:
So I try to drop them using:
df = df.dropna(how='all')
I still just get this (I know I'm only showing 3 columns, but all the columns are filled with NaNs):
I've tried assuming they're strings and using:
df = df[~df.isin(['NaN']).any(axis=1)]
This also didn't work. Any other thoughts or ideas?
When you slice with a Boolean DataFrame, the logic used is where. That is, where the mask is True it returns the value, and where the mask is False it chooses np.nan by default.
Thus, if you slice with df.isna(), you turn everything into NaN by definition. Where df.isna() is True, where passes the original value through (which is NaN), and where the df was not null, where substitutes NaN.
import pandas as pd
import numpy as np
df = pd.DataFrame({'foo': np.nan, 'bar': np.nan, 'baz': np.nan, 'boo': 1}, index=['A'])
# foo bar baz boo
#A NaN NaN NaN 1
df.isnull()
# foo bar baz boo
#A True True True False
df[df.isnull()]
# foo bar baz boo
#A NaN NaN NaN NaN
df.where(df.isnull())
# foo bar baz boo
#A NaN NaN NaN NaN
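To make the difference concrete, here is a minimal sketch contrasting a Boolean DataFrame mask (cell-wise where logic) with a Boolean Series mask (row filtering). The small frame is made up for illustration and is not the data from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data, not taken from the question.
df = pd.DataFrame({'foo': [1.0, np.nan], 'bar': [2.0, 3.0]}, index=['A', 'B'])

# Boolean DataFrame mask: the result keeps the original shape, and every
# cell where the mask is False is replaced with NaN (same as df.where).
cell_mask = df.isnull()
masked = df[cell_mask]

# Boolean Series mask: rows are filtered and the values are left untouched.
row_mask = df.isnull().any(axis=1)
filtered = df[row_mask]

print(masked)
print(filtered)
```

With the DataFrame mask every cell ends up NaN, while the Series mask returns only row B with its original values.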
So you don't have rows full of NaN; your mask just guarantees every cell becomes NaN. If you want to inspect rows that contain NaN without modifying the values, you can display rows with at least 1 NaN:
df[df.isnull().any(axis=1)]
# foo bar baz boo
#A NaN NaN NaN 1
Or, to see the distribution of NaN across the rows, take the value counts of the sum across rows. This shows we have 1 row with 3 null values.
df.isnull().sum(axis=1).value_counts()
#3 1
#dtype: int64
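As a rough check that dropna itself behaves as expected, the sketch below compares how='all' with how='any' on a made-up frame (again hypothetical data, not the asker's). It assumes the values really are missing values rather than the string 'NaN'.

```python
import numpy as np
import pandas as pd

# Hypothetical data: row A is partly null, row B is entirely null, row C is complete.
df = pd.DataFrame(
    {
        'foo': [np.nan, np.nan, 1.0],
        'bar': [np.nan, np.nan, 2.0],
        'boo': [1.0, np.nan, 3.0],
    },
    index=['A', 'B', 'C'],
)

# how='all' drops a row only when every column in it is null.
dropped_all = df.dropna(how='all')

# how='any' drops a row as soon as one column in it is null.
dropped_any = df.dropna(how='any')

# Count rows that still contain at least one NaN, using a row-wise Series mask.
remaining = int(dropped_all.isnull().any(axis=1).sum())

print(dropped_all)
print(dropped_any)
print(remaining)
```

A row like A, which has one non-null value, survives how='all', so it can look as if dropna did nothing even though it worked as documented.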