Pandas: Filter in rows that have a Null/None/NaN value in any of several specific columns
I have a CSV file that contains many "NULL" strings in several columns.
I would like to select (filter in) rows that have a "NULL" value in any of several specific columns.
Example:

Firstname   Lastname    Profession
"Jeff"      "Goldblum"  "NULL"
"NULL"      "Coltrane"  "Musician"
"Richard"   "NULL"      "Physicist"
Here, I would like to filter in (select) rows in df that have the value "NULL" in the column "Firstname" or "Lastname", but not rows where "NULL" appears only in "Profession".
This manages to filter in strings (not None) in one column:
df = df[df["Firstname"].str.contains("NULL", case=False)]
I have, however, attempted to convert the "NULL" strings to None via:
df = df.where((pd.notnull(df)), None)
df.columns = df.columns.str.lower()
Given the above str.contains filtering, perhaps it is easier to filter in the "NULL" strings before converting to None?
I think you first need to replace the "NULL" strings with NaN. Then check for NaN values in the selected columns with isnull, and use boolean indexing to select all rows where any of those values is True:
import numpy as np

# Turn the "NULL" strings into real missing values
df = df.replace("NULL", np.nan)

print(df[['Firstname','Lastname']].isnull())
   Firstname  Lastname
0      False     False
1       True     False
2      False      True

# Keep rows where any of the selected columns is missing
print(df[df[['Firstname','Lastname']].isnull().any(axis=1)])
  Firstname  Lastname Profession
1       NaN  Coltrane   Musician
2   Richard       NaN  Physicist
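For anyone who wants to run the steps above end to end, here is a minimal, self-contained sketch. It assumes the sample data from the question is built by hand rather than read from the CSV file, and the names cols, cleaned, mask and result are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data from the question, built by hand (assumption: the real
# data would come from the CSV file instead).
df = pd.DataFrame({
    "Firstname": ["Jeff", "NULL", "Richard"],
    "Lastname": ["Goldblum", "Coltrane", "NULL"],
    "Profession": ["NULL", "Musician", "Physicist"],
})

# Columns that should be checked for missing values.
cols = ["Firstname", "Lastname"]

# Step 1: turn the "NULL" strings into real missing values.
cleaned = df.replace("NULL", np.nan)

# Step 2: build a row mask that is True when any checked column is missing.
mask = cleaned[cols].isnull().any(axis=1)

# Step 3: boolean indexing keeps only the matching rows.
result = cleaned[mask]
print(result)
```

The row for "Jeff" is dropped even though its "Profession" is missing, because "Profession" is not in cols.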
You can try:
df = df.replace(to_replace="NULL", value=None)
to replace all occurrences of "NULL" with None. Note that replace returns a new DataFrame, so the result is assigned back to df here.
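On the question's idea of filtering before converting: one possible sketch is to compare the selected columns against the literal string "NULL" directly, with no replacement step. This assumes the cells hold exactly "NULL" (the comparison is an exact, case-sensitive match, unlike str.contains with case=False), and the names cols, mask and selected are only illustrative.

```python
import pandas as pd

# Sample data from the question, built by hand for illustration.
df = pd.DataFrame({
    "Firstname": ["Jeff", "NULL", "Richard"],
    "Lastname": ["Goldblum", "Coltrane", "NULL"],
    "Profession": ["NULL", "Musician", "Physicist"],
})

# Only these columns take part in the check.
cols = ["Firstname", "Lastname"]

# Element-wise comparison with the literal string, then reduce per row:
# True when at least one of the checked columns equals "NULL".
mask = df[cols].eq("NULL").any(axis=1)

# The "NULL" strings are left untouched in the selected rows.
selected = df[mask]
print(selected)
```

The conversion to None or NaN can then be applied afterwards to selected, or to df as a whole, if real missing values are needed.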