Pandas: Filter rows that have Null/None/NaN values in any of several specific columns

Question

I have a CSV file that contains many "NULL" strings, spread across several columns.

I would like to select (filter in) the rows that have a "NULL" value in any of several specific columns.

Example:

 ["Firstname"]  ["Lastname"]  ["Profession"]
 "Jeff"         "Goldblum"    "NULL"
 "NULL"         "Coltrane"    "Musician"
 "Richard"      "NULL"        "Physicist"

Here, I would like to select the rows in df that have the value "NULL" in the column "Firstname" or "Lastname". A "NULL" in "Profession" should not cause a row to be selected.

This manages to filter in the "NULL" strings (not None) in one column:

df = df[df["Firstname"].str.contains("NULL", case=False)]

However, I have attempted to convert the "NULL" strings to None via:

df = df.where((pd.notnull(df)), None)
df.columns = df.columns.str.lower()

Given the above str.contains filtering, perhaps it is easier to filter in the "NULL" strings before converting to None?
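For reference, a minimal sketch of filtering on the literal strings in several columns at once, without converting anything first. The sample frame is rebuilt from the example above, and the names cols, mask and selected are only illustrative. Note that eq("NULL") is an exact match, whereas str.contains with case=False also matches substrings and other capitalizations.

```python
import pandas as pd

# Sample data rebuilt from the example in the question.
df = pd.DataFrame({
    "Firstname": ["Jeff", "NULL", "Richard"],
    "Lastname": ["Goldblum", "Coltrane", "NULL"],
    "Profession": ["NULL", "Musician", "Physicist"],
})

# Only these columns should trigger a match; "Profession" is ignored.
cols = ["Firstname", "Lastname"]

# Element-wise comparison against the literal string, then reduce per row.
mask = df[cols].eq("NULL").any(axis=1)
selected = df[mask]
print(selected)
```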

Answer 1

I think you first need to replace the "NULL" strings with NaN. Then check for NaN values in the selected columns with isnull, and use boolean indexing to select all rows where any of those values is True:

import numpy as np

df = df.replace("NULL", np.nan)

print(df[['Firstname', 'Lastname']].isnull())
  Firstname Lastname
0     False    False
1      True    False
2     False     True

print(df[df[['Firstname', 'Lastname']].isnull().any(axis=1)])
  Firstname  Lastname Profession
1       NaN  Coltrane   Musician
2   Richard       NaN  Physicist
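Put together as a self-contained sketch, assuming the sample data from the question. The helper name rows_with_null and its marker parameter are hypothetical, introduced here only for illustration.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "Firstname": ["Jeff", "NULL", "Richard"],
    "Lastname": ["Goldblum", "Coltrane", "NULL"],
    "Profession": ["NULL", "Musician", "Physicist"],
})


def rows_with_null(frame, columns, marker="NULL"):
    """Return rows where any of `columns` held the placeholder string."""
    # Turn the placeholder string into a real missing value everywhere.
    cleaned = frame.replace(marker, np.nan)
    # Check only the requested columns; True where any of them is missing.
    mask = cleaned[columns].isnull().any(axis=1)
    return cleaned[mask]


result = rows_with_null(df, ["Firstname", "Lastname"])
print(result)
```

The row for "Jeff" is left out even though its "Profession" is missing, because only the listed columns take part in the check.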

Answer 2

You can try:

df = df.replace(to_replace="NULL", value=None)

to replace all occurrences of "NULL" with None. Note that replace returns a new DataFrame, so assign the result back to df.
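A minimal sketch of this variant followed by the row filter, assuming the sample data from the question. The dict form of the replacement is a substitution made here and not the exact call above: as far as I know, an explicit value=None was handled differently in pandas versions before 1.4, so the dict form may be the more portable spelling.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "Firstname": ["Jeff", "NULL", "Richard"],
    "Lastname": ["Goldblum", "Coltrane", "NULL"],
    "Profession": ["NULL", "Musician", "Physicist"],
})

# replace returns a new DataFrame; assign it back to keep the change.
df = df.replace({"NULL": None})

# isna treats None as missing, so the same row filter applies afterwards.
mask = df[["Firstname", "Lastname"]].isna().any(axis=1)
print(df[mask])
```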

Answer 1
5 accepted 2016-10-04 11:09:00

Answer 2
1 2016-10-04 11:07:07
