Drop DataFrame rows based on multiple conditions

Question

I am trying to drop some rows from a pandas DataFrame based on four conditions that must all be met in the same row. So I tried the following command:

my_data.drop(my_data[(my_data.column1 is None) & (my_data.column2 is None) & (my_data.column3 is None) & (my_data.column4 is None)].index, inplace=True)

And it throws an error (the original post showed it as an image).

I've also tried:

my_data= my_data.loc[my_data[(my_data.column1 is None) & (my_data.column2 is None) & (my_data.column3 is None) & (my_data.column4 is None), :]

but without success.

Can I get some help, please? :)

I'm working with Python 3.5.

Answer 1

Normally, a column is checked for null values element-wise with the isnull method:

df.drop(df[df['column1'].isnull() 
          & df['column2'].isnull() 
          & df['column3'].isnull() 
          & df['column4'].isnull()].index)
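As a minimal sketch of why the attempts in the question fail: `my_data.column1 is None` is a Python identity test on the Series object itself, so it evaluates to a single `False` rather than to a row-by-row mask. The small `my_data` frame below is made up for illustration and is not the asker's data.

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration; not from the original question.
my_data = pd.DataFrame({
    "column1": [1.0, np.nan, np.nan],
    "column2": [2.0, np.nan, 5.0],
})

# Identity test on the Series object: one plain bool, not a mask.
identity_check = my_data.column1 is None
print(identity_check)  # False

# Element-wise null test: one boolean per row.
mask = my_data["column1"].isnull() & my_data["column2"].isnull()
print(mask.tolist())  # [False, True, False]

# Drop the rows where every listed column is null.
cleaned = my_data.drop(my_data[mask].index)
print(cleaned.index.tolist())  # [0, 2]
```

Because the identity test yields a single `False`, combining several of them with `&` still gives one scalar, which cannot select rows.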

However, there are more compact and idiomatic ways to do that:

df.dropna(subset=['column1', 'column2', 'column3', 'column4'], how='all')

A demo:

import numpy as np
import pandas as pd

prng = np.random.RandomState(0)
df = pd.DataFrame(prng.randn(100, 6), columns=['column{}'.format(i) for i in range(1, 7)])

df.head()
Out: 
    column1   column2   column3   column4   column5   column6
0  1.764052  0.400157  0.978738  2.240893  1.867558 -0.977278
1  0.950088 -0.151357 -0.103219  0.410599  0.144044  1.454274
2  0.761038  0.121675  0.443863  0.333674  1.494079 -0.205158
3  0.313068 -0.854096 -2.552990  0.653619  0.864436 -0.742165
4  2.269755 -1.454366  0.045759 -0.187184  1.532779  1.469359

df = df.mask(prng.binomial(1, 0.5, df.shape).astype('bool'), np.nan)

df.head()
Out: 
    column1   column2   column3   column4   column5   column6
0       NaN  0.400157       NaN  2.240893       NaN       NaN
1  0.950088 -0.151357 -0.103219  0.410599  0.144044       NaN
2  0.761038  0.121675       NaN       NaN       NaN -0.205158
3       NaN       NaN -2.552990       NaN  0.864436       NaN
4  2.269755 -1.454366  0.045759 -0.187184       NaN       NaN

The following drops a row only if columns 1, 3, 5 and 6 are all null in that row:

df.dropna(subset=['column1', 'column3', 'column5', 'column6'], how='all').head()
Out: 
    column1   column2   column3   column4   column5   column6
1  0.950088 -0.151357 -0.103219  0.410599  0.144044       NaN
2  0.761038  0.121675       NaN       NaN       NaN -0.205158
3       NaN       NaN -2.552990       NaN  0.864436       NaN
4  2.269755 -1.454366  0.045759 -0.187184       NaN       NaN
5  0.154947  0.378163 -0.887786 -1.980796 -0.347912       NaN
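To check that the `dropna` call matches the explicit boolean-mask approach, here is a small sketch. It rebuilds a frame with the same recipe as the demo but only 20 rows, so the values will not match the output shown above. The names `cols`, `all_null`, `via_mask` and `via_dropna` are introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Rebuild a small frame with the same recipe as the demo (fewer rows).
prng = np.random.RandomState(0)
df = pd.DataFrame(prng.randn(20, 6),
                  columns=['column{}'.format(i) for i in range(1, 7)])
df = df.mask(prng.binomial(1, 0.5, df.shape).astype('bool'), np.nan)

cols = ['column1', 'column3', 'column5', 'column6']

# True for rows where every column in `cols` is null.
all_null = df[cols].isnull().all(axis=1)

# Keep the rows that are NOT entirely null across `cols`.
via_mask = df[~all_null]

# Same selection expressed with dropna.
via_dropna = df.dropna(subset=cols, how='all')

# Compare the two results (same rows, same values, same index).
print(via_mask.equals(via_dropna))
```

With `how='any'` instead, `dropna` would drop a row as soon as at least one of the listed columns is null.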

Answer 1: 4, accepted, 2017-04-27 16:54:47
