
Delete all rows containing all zeros after a certain column number

In a pandas DataFrame, how do I delete all rows that contain only zeros after a certain column? For example:

from pandas import DataFrame
df = DataFrame({'a' : [0,1,1,0,0,0,0], 'b' : [0,1,-1, 1,0,0,0], 'c': [1,4,5,6,7,0,0]}).T

df:

    0   1   2   3   4   5   6
a   0   1   1   0   0   0   0
b   0   1   -1  1   0   0   0
c   1   4   5   6   7   0   0

How do I drop the rows in which every value after column 3 is zero? In this example, the first and second rows (index a and b) should be dropped.

You can select the columns by subscripting, replace 0 with NaN, drop any rows that don't have at least one non-NaN value, and then use loc with the resulting index:

In [63]:
import numpy as np
df.loc[df[df.columns[4:]].replace(0, np.nan).dropna(thresh=1).index]
Out[63]:
   0  1  2  3  4  5  6
c  1  4  5  6  7  0  0

So breaking this down:

In [64]:
df[df.columns[4:]]

Out[64]:
   4  5  6
a  0  0  0
b  0  0  0
c  7  0  0

In [66]:
df[df.columns[4:]].replace(0, np.nan)

Out[66]:
    4   5   6
a NaN NaN NaN
b NaN NaN NaN
c   7 NaN NaN

In [67]:
df[df.columns[4:]].replace(0, np.nan).dropna(thresh=1)

Out[67]:
   4   5   6
c  7 NaN NaN

In [68]:
df[df.columns[4:]].replace(0, np.nan).dropna(thresh=1).index

Out[68]:
Index(['c'], dtype='object')
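The steps above can be collected into one self-contained script. This is only a sketch of the same replace/dropna approach; the intermediate names `tail`, `masked`, `kept` and `result` are introduced here for illustration and are not part of the original answer.

```python
import numpy as np
import pandas as pd

# Same frame as in the question: rows a, b, c and columns 0..6.
df = pd.DataFrame({'a': [0, 1, 1, 0, 0, 0, 0],
                   'b': [0, 1, -1, 1, 0, 0, 0],
                   'c': [1, 4, 5, 6, 7, 0, 0]}).T

tail = df[df.columns[4:]]             # only the columns after column 3
masked = tail.replace(0, np.nan)      # zeros become missing values
kept = masked.dropna(thresh=1).index  # labels of rows with at least one non-NaN value
result = df.loc[kept]                 # select those rows from the full frame

print(result)
```

Selecting with `df.loc[kept]` goes back to the original frame, so the surviving rows keep their original integer values rather than the NaN-masked ones.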

Update: Actually, there is a more concise way:

In [77]:
df[(df[df.columns[4:]] != 0).any(axis=1)]
Out[77]:
   0  1  2  3  4  5  6
c  1  4  5  6  7  0  0
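If this filter is needed in more than one place, the boolean-mask idea could be wrapped in a small helper. The function name `drop_zero_tail_rows` and its `start` parameter are hypothetical, chosen only for this sketch; it selects columns by position with `iloc` and uses the DataFrame method `.any(axis=1)`, since the built-in `any` does not accept an `axis` argument.

```python
import pandas as pd

def drop_zero_tail_rows(frame, start):
    """Keep rows with at least one non-zero value from column position `start` on."""
    tail = frame.iloc[:, start:]           # columns at positions start, start+1, ...
    return frame[(tail != 0).any(axis=1)]  # True for rows with any non-zero entry

df = pd.DataFrame({'a': [0, 1, 1, 0, 0, 0, 0],
                   'b': [0, 1, -1, 1, 0, 0, 0],
                   'c': [1, 4, 5, 6, 7, 0, 0]}).T

result = drop_zero_tail_rows(df, 4)
print(result)
```

Note that `NaN != 0` evaluates to True, so a row whose trailing values are missing rather than zero would be kept by this mask.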

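If you have an arbitrary number of columns, you can always do this:

df[df.iloc[:, 4:].T.abs().sum() != 0]

With only a few columns, you can also write the conditions out explicitly:

df[(df[4] != 0) | (df[5] != 0) | (df[6] != 0)]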
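The `.abs()` call in the sum-based filter matters: without it, positive and negative values in the same row can cancel out and the row is wrongly treated as all zeros. The small frame `demo` below is made up for this illustration and is not from the original question.

```python
import pandas as pd

# Row p has +1 and -1 after column 0, which cancel in a plain sum.
demo = pd.DataFrame([[5, 1, -1, 0],
                     [5, 0, 0, 0]], index=['p', 'q'])

tail = demo.iloc[:, 1:]  # every column after position 0

plain = demo[tail.sum(axis=1) != 0]           # drops p by mistake (1 - 1 + 0 == 0)
with_abs = demo[tail.abs().sum(axis=1) != 0]  # keeps p, drops q
explicit = demo[(demo[1] != 0) | (demo[2] != 0) | (demo[3] != 0)]  # same result as with_abs

print(with_abs)
```

Here `sum(axis=1)` is used in place of transposing and summing; both add up the values of each row.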
