pandas: slice dataframe based on NaN
I have the following dataframe df:
prod_id prod_ref
10 ef3920
12 bovjhd
NaN lkbljb
NaN jknnkn
30 kbknkn
I am trying the following:
df[df['prod_id'] != np.nan]
but I get exactly the same dataframe.
I would like to display:
prod_id prod_ref
10 ef3920
12 bovjhd
30 kbknkn
What am I doing wrong?
Use the function notna, or invert isna:
print (df[df.prod_id.notna()])
prod_id prod_ref
0 10.0 ef3920
1 12.0 bovjhd
4 30.0 kbknkn
print (df[~df.prod_id.isna()])
prod_id prod_ref
0 10.0 ef3920
1 12.0 bovjhd
4 30.0 kbknkn
Another solution is dropna, but you need to specify the column to check for NaN:
print (df.dropna(subset=['prod_id']))
prod_id prod_ref
0 10.0 ef3920
1 12.0 bovjhd
4 30.0 kbknkn
If the other columns contain no NaN values, you can use Alberto Garcia-Raboso's solution.
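As a quick check, the three approaches above can be compared on a hand-built copy of the dataframe from the question. This is only a sketch: the variable names by_notna, by_isna and by_dropna are made up for illustration, and it assumes a pandas version that has notna (0.21 or later).

```python
import numpy as np
import pandas as pd

# The dataframe from the question, rebuilt by hand
df = pd.DataFrame({
    "prod_id": [10, 12, np.nan, np.nan, 30],
    "prod_ref": ["ef3920", "bovjhd", "lkbljb", "jknnkn", "kbknkn"],
})

by_notna = df[df["prod_id"].notna()]        # keep rows where prod_id is present
by_isna = df[~df["prod_id"].isna()]         # same mask, written as a negation
by_dropna = df.dropna(subset=["prod_id"])   # drop rows where prod_id is NaN

# All three selections should contain the same rows
print(by_notna.equals(by_isna) and by_notna.equals(by_dropna))
print(by_notna.index.tolist())
```

Note that the original index labels are kept; add .reset_index(drop=True) if you want the result renumbered from 0.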
The problem is that np.nan != np.nan is True (alternatively, np.nan == np.nan is False). Pandas provides the .dropna() method to do what you want:
df.dropna()
Output:
prod_id prod_ref
0 10.0 ef3920
1 12.0 bovjhd
4 30.0 kbknkn
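Going back to the comparison in the question, a small sketch shows why the filter keeps every row. NaN compares unequal to everything, including itself, so the mask built with != np.nan is True for all rows. The Series name prod_id mirrors the question's column; mask is just an illustrative variable.

```python
import numpy as np
import pandas as pd

prod_id = pd.Series([10, 12, np.nan, np.nan, 30], name="prod_id")

# NaN is never equal to anything, not even to itself
print(np.nan != np.nan)
print(np.nan == np.nan)

# So the elementwise comparison is True on every row, NaN rows included,
# and indexing with it returns the whole dataframe unchanged
mask = prod_id != np.nan
print(mask.tolist())
```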
By default, .dropna() will drop any row that has a NaN in any column. You can tweak this behavior in two ways:
check only certain columns, using the subset argument, and
require that the row has NaN in all columns (in the subset, if you are using it), using how='all'; the default is how='any'.
You can check the documentation.
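The two options can be seen side by side on a small made-up dataframe. This is a sketch only: df2 and its price column are hypothetical and not part of the question's data. They are there so that some rows have NaN in one column, one row has NaN in both id columns, and one column never has NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical data: "price" is an invented column that is never NaN
df2 = pd.DataFrame({
    "prod_id": [10, np.nan, np.nan, 30],
    "prod_ref": ["ef3920", "lkbljb", np.nan, np.nan],
    "price": [1.0, 2.0, 3.0, 4.0],
})

# Default (how='any'): drop a row if any column is NaN
any_all_cols = df2.dropna()

# subset: only look at prod_id when deciding what to drop
any_subset = df2.dropna(subset=["prod_id"])

# how='all': drop a row only if every column is NaN (none are, because of price)
all_all_cols = df2.dropna(how="all")

# Both together: drop a row only if prod_id and prod_ref are both NaN
all_subset = df2.dropna(subset=["prod_id", "prod_ref"], how="all")

for name, result in [("any_all_cols", any_all_cols), ("any_subset", any_subset),
                     ("all_all_cols", all_all_cols), ("all_subset", all_subset)]:
    print(name, result.index.tolist())
```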