How to delete rows from a dataframe if specific columns contain null values?

Question

I need to delete rows from a dataframe if specific columns contain null values.

In this example, a row should be deleted if both col2 and col3 are null:

import pandas as pd

obj = {'col1': [1, 2,7,47,12,67,58], 'col2': [741, 332,7,'Nan',127,'Nan',548],  'col3': ['Nan', 2,74,'Nan',127,'Nan',548] }

df = pd.DataFrame(data=obj)



df.head()
    col1 col2   col3
0   1    741    Nan
1   2    332    2
2   7    7      74
3   47   Nan    Nan
4   12   127    127
5   67   Nan    Nan
6   58   548    548

After deletion, the result should be:

df.head()
        col1 col2   col3
    0   1    741    Nan
    1   2    332    2
    2   7    7      74
    4   12   127    127
    6   58   548    548

Thanks to all!

Answer 1

Use Boolean indexing with DataFrame.isna or DataFrame.isnull to check for NaN or null values. Set the maximum number of NaN values allowed per row with DataFrame.sum and Series.le:

import numpy as np
# Convert the 'Nan' placeholder strings to real missing values
df=df.replace('Nan',np.nan)
# Keep rows that have at most one missing value
new_df=df[df.isnull().sum(axis=1).le(1)]
print(new_df)

   col1   col2   col3
0     1  741.0    NaN
1     2  332.0    2.0
2     7    7.0   74.0
4    12  127.0  127.0
6    58  548.0  548.0
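
The row-wise count above can be wrapped in a small helper. This is only a sketch on the question's sample data; the function name `drop_rows_with_many_nulls` and its `max_nulls` parameter are hypothetical names chosen for illustration, not part of pandas.

```python
import numpy as np
import pandas as pd

def drop_rows_with_many_nulls(frame, max_nulls):
    """Keep rows that have at most `max_nulls` missing values (hypothetical helper)."""
    null_counts = frame.isna().sum(axis=1)  # number of missing values per row
    return frame[null_counts.le(max_nulls)]

obj = {'col1': [1, 2, 7, 47, 12, 67, 58],
       'col2': [741, 332, 7, 'Nan', 127, 'Nan', 548],
       'col3': ['Nan', 2, 74, 'Nan', 127, 'Nan', 548]}

# The sample data uses the string 'Nan', so convert it to a real NaN first
df = pd.DataFrame(data=obj).replace('Nan', np.nan)

result = drop_rows_with_many_nulls(df, max_nulls=1)
print(result.index.tolist())  # [0, 1, 2, 4, 6]
```

Note that this counts missing values across every column, so it only matches the requested result because col1 has no missing values here.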

To restrict the check to specific columns, use DataFrame.all:

df=df.replace('Nan',np.nan)
df_filtered=df[~df[['col2','col3']].isnull().all(axis=1)]
print(df_filtered)

   col1   col2   col3
0     1  741.0    NaN
1     2  332.0    2.0
2     7    7.0   74.0
4    12  127.0  127.0
6    58  548.0  548.0
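
To make the choice of `all` concrete, the following sketch contrasts it with `any` on the same sample data. The variable names (`both_null`, `either_null`, and so on) are illustrative only.

```python
import numpy as np
import pandas as pd

obj = {'col1': [1, 2, 7, 47, 12, 67, 58],
       'col2': [741, 332, 7, 'Nan', 127, 'Nan', 548],
       'col3': ['Nan', 2, 74, 'Nan', 127, 'Nan', 548]}
df = pd.DataFrame(data=obj).replace('Nan', np.nan)

nulls = df[['col2', 'col3']].isnull()

# all(axis=1): True only where col2 AND col3 are both null
both_null = nulls.all(axis=1)
# any(axis=1): True where col2 OR col3 is null
either_null = nulls.any(axis=1)

kept_all = df[~both_null]    # drops rows 3 and 5
kept_any = df[~either_null]  # also drops row 0, where only col3 is null

print(kept_all.index.tolist())  # [0, 1, 2, 4, 6]
print(kept_any.index.tolist())  # [1, 2, 4, 6]
```

The question asks to drop a row only when both columns are null, which is why `all` is the matching choice.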

Answer 2

Using dropna

axis=0 deletes rows; thresh=1 is the minimum number of non-null values a row must have in order to be kept.

You can use subset=['col2', 'col3'] to define the columns on which the decision to drop rows is based.

You can try this, after converting the 'Nan' strings to np.nan. Pass only thresh, since recent pandas versions do not accept how and thresh together:

df = df.dropna(axis=0, subset=['col2', 'col3'], thresh=1)
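
For comparison, here is a sketch of the dropna route on the question's sample data, using one of `how` or `thresh` at a time. With a two-column subset, `how='all'` and `thresh=1` should select the same rows; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

obj = {'col1': [1, 2, 7, 47, 12, 67, 58],
       'col2': [741, 332, 7, 'Nan', 127, 'Nan', 548],
       'col3': ['Nan', 2, 74, 'Nan', 127, 'Nan', 548]}

# dropna only recognizes real missing values, not the string 'Nan'
df = pd.DataFrame(data=obj).replace('Nan', np.nan)

# Drop a row only when every column in the subset is null
by_how = df.dropna(axis=0, subset=['col2', 'col3'], how='all')

# Keep a row when at least one column in the subset is non-null
by_thresh = df.dropna(axis=0, subset=['col2', 'col3'], thresh=1)

print(by_how.index.tolist())     # [0, 1, 2, 4, 6]
print(by_thresh.index.tolist())  # [0, 1, 2, 4, 6]
```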

Answer 3

After applying the solution proposed by @ansev, everything worked:

import numpy as np
import pandas as pd

obj = {'col1': [1, 2,7,47,12,67,58], 'col2': [741, 332,7,'Nan',127,'Nan',548],  'col3': ['Nan', 2,74,'Nan',127,'Nan',548] }

df = pd.DataFrame(data=obj)

df=df.replace('Nan',np.nan)
df_filtered=df[~df[['col2','col3']].isnull().all(axis=1)]

print(df_filtered)

   col1   col2   col3
0     1  741.0    NaN
1     2  332.0    2.0
2     7    7.0   74.0
4    12  127.0  127.0
6    58  548.0  548.0
