Pandas - Deleting rows with missing data using .isnull(), notnull(), dropna()

Question

This is really weird. I have tried several ways of dropping rows with missing data from a pandas DataFrame, but none of them seem to work. This is the code (I uncomment just one of the methods at a time; these are the three I have tried in different variations, and this is the latest):

import pandas as pd
Test = pd.DataFrame({'A':[1,2,3,4,5],'B':[1,2,'NaN',4,5],'C':[1,2,3,'NaT',5]})
print(Test)
#Test = Test.loc[Test.C.notnull()]
#Test = Test.dropna()
Test = Test[~Test[Test.columns.values].isnull()]
print("And now")
print(Test)

But in all cases, all I get is this:

   A    B    C
0  1    1    1
1  2    2    2
2  3  NaN    3
3  4    4  NaT
4  5    5    5
And now
   A    B    C
0  1    1    1
1  2    2    2
2  3  NaN    3
3  4    4  NaT
4  5    5    5

Am I making a mistake somewhere, or what is the problem? Ideally, I would like to get this:

   A    B    C
0  1    1    1
1  2    2    2
4  5    5    5

Answer 1

Your example DF has NaN and NaT as strings, which .dropna, .notnull and co. won't treat as missing, so given your example you can use:

df[~df.isin(['NaN', 'NaT']).any(axis=1)]

This keeps only rows 0, 1 and 4, which is the output you asked for.
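As a quick check, here is a minimal sketch that rebuilds the frame from the question and applies the filter. The names df, missing_cells, has_placeholder and filtered are only illustrative; df is assumed to hold the same data as Test in the question.

```python
import pandas as pd

# Same data as in the question: 'NaN' and 'NaT' are ordinary strings here.
df = pd.DataFrame({'A': [1, 2, 3, 4, 5],
                   'B': [1, 2, 'NaN', 4, 5],
                   'C': [1, 2, 3, 'NaT', 5]})

# Strings are not missing values, so isnull() flags nothing.
missing_cells = int(df.isnull().sum().sum())

# Flag rows that contain either placeholder string, then keep the rest.
has_placeholder = df.isin(['NaN', 'NaT']).any(axis=1)
filtered = df[~has_placeholder]

print(missing_cells)          # 0
print(list(filtered.index))   # [0, 1, 4]
```

The zero count is why dropna() and notnull() left the original frame untouched.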

If you had a DF such as this (note the use of np.nan and np.datetime64('NaT') instead of strings):

df = pd.DataFrame({'A':[1,2,3,4,5],'B':[1,2,np.nan,4,5],'C':[1,2,3,np.datetime64('NaT'),5]})

Then running df.dropna() gives you:

   A    B  C
0  1  1.0  1
1  2  2.0  2
4  5  5.0  5

Note that column B is now float instead of integer, as that's required to store NaN values.
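The dtype change can be seen on a single column. The sketch below is illustrative only (the variable names are made up here), and the last part assumes a pandas version recent enough to support the nullable 'Int64' dtype.

```python
import numpy as np
import pandas as pd

# A NaN among integers forces the whole column to float64.
with_gap = pd.Series([1, 2, np.nan, 4, 5])
print(with_gap.dtype)                # float64

# Dropping the missing entry does not change the dtype back.
dropped = with_gap.dropna()
print(dropped.dtype)                 # float64

# Once no NaN remains, an explicit cast restores integers.
restored = dropped.astype('int64')
print(restored.tolist())             # [1, 2, 4, 5]

# Assumption: nullable integer dtype is available in the installed pandas.
nullable = pd.Series([1, 2, None, 4, 5], dtype='Int64')
print(nullable.dropna().tolist())    # [1, 2, 4, 5]
```

If integer columns matter downstream, casting after dropna() is the simplest option; the nullable dtype avoids the float detour altogether.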

Answer 2

Try this on the original data:

import numpy as np
# Turn the placeholder strings into real missing values, then drop those rows.
Test.replace(['NaN', 'NaT'], np.nan, inplace=True)
Test = Test.dropna()
Test
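A different technique from the replace() call above, shown here only as a sketch: if every column is meant to be numeric (an assumption about the data, not something the question states), pd.to_numeric with errors='coerce' turns any unparseable string into NaN without listing the placeholders by hand. The names numeric and cleaned are illustrative.

```python
import pandas as pd

Test = pd.DataFrame({'A': [1, 2, 3, 4, 5],
                     'B': [1, 2, 'NaN', 4, 5],
                     'C': [1, 2, 3, 'NaT', 5]})

# Coerce every column to numbers; anything that is not a number becomes NaN.
numeric = Test.apply(pd.to_numeric, errors='coerce')

# dropna() now sees real missing values; cast back to integers afterwards.
cleaned = numeric.dropna().astype('int64')

print(list(cleaned.index))   # [0, 1, 4]
```

This would also wipe out any legitimate text columns, so it only fits frames that are numeric throughout.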

Or modify the data and do this:

import pandas as pd
import numpy as np

# Use real missing-value markers instead of the strings 'NaN' and 'NaT'.
Test = pd.DataFrame({'A':[1,2,3,4,5],'B':[1,2,np.nan,4,5],'C':[1,2,3,pd.NaT,5]})
print(Test)
Test = Test.dropna()
print(Test)

The second print shows:

   A    B  C
0  1  1.0  1
1  2  2.0  2
4  5  5.0  5
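Once the frame holds real missing values, dropna() can also be narrowed to specific columns, which matches the commented-out Test.C.notnull() attempt in the question. A minimal sketch, with by_c and by_mask as illustrative names:

```python
import numpy as np
import pandas as pd

Test = pd.DataFrame({'A': [1, 2, 3, 4, 5],
                     'B': [1, 2, np.nan, 4, 5],
                     'C': [1, 2, 3, pd.NaT, 5]})

# Only look at column C when deciding which rows to drop.
by_c = Test.dropna(subset=['C'])
print(list(by_c.index))              # [0, 1, 2, 4]

# Equivalent boolean-mask form, as attempted in the question.
by_mask = Test[Test['C'].notnull()]
print(by_mask.equals(by_c))          # True

# how='all' drops a row only if every column is missing; none are here.
print(len(Test.dropna(how='all')))   # 5
```

Row 2 survives in by_c because its missing value is in column B, which subset=['C'] ignores.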

Answer 1: 14, accepted, 2016-09-06 03:02:02
Answer 2: 10, 2016-09-06 03:17:34
