remove specific rows in dataframe with pandas

Question

i need some help from all of you I'm working with a data form from excel, so basically now i have something like this.

csr   id   ac  otc  tm  lease  maint 
 1    456  b    0    0    0      0
 1    543  a    0    1    1      0
 1    435  e    0    0    0      0
 2    123  w    1    1    1      1
 2    123  g    0    0    0      0
 3    987  j    0    0    0      0
 4    258  k    1    1    1      1 
 4    258  m    0    0    0      0

So i need to delete the rows with zero in 'otc' 'tm' 'lease' 'maint' columns. i do something like this

df = pd.read_excel(xlsx,'Sheet1')
df_zero = df[(df['OTC'] == 0) & (df['TM'] == 0) & (df['Lease'] == 0) & (df['Maint'] == 0) & (df['Support'] == 0) & (df['Other']  == 0)]

In this way i open the file and save in df_zero all the rows that contain zero in the specific column. Then

df1 = df_zero.loc[:, 'CSR']

Basically this save in df1 the CSR number for the rows with zeros in the specific columns, like this

So i think ok i do this and problem resolved.

for n1 in df1:
    df = df[df.CSR != n1]

But the problem here is, as you can see in the CSR 1, we have 3 different rows, if i run that 'for', i will delete the 3 of them, i just need to remove the ones that have zeros in the specific columns ('otc' 'tm' 'lease' 'maint').

I think in a 'for' for be moving in the CSR and another one to be moving in 'otc' if the value that i found is zero move to 'tm'(in the same row) check for zero, then to 'lease' and 'maint' in the same row, if any of this columns is not zero, jump to the next CSR. In this example. We will remove the CSR 1, because all of them ('otc' 'tm' 'lease' 'maint') are zero, then jump to the next CSR, again 1, but in this case we have zero in 'otc' but 1 in 'tm', so we have to jump to the next CSR is again 1 but all of the columns ('otc' 'tm' 'lease' 'maint') are in zero so we remove the row, and continue in this way until the last CSR...

I think that could work but i'm having some problems to implement that, or maybe any of you have a better idea. Thanks and sorry for bad english

Answer 1

You can also extract the four columns that you are interested and count how many zeros it has for each row and create logical vector for indexing:

df[(df[['otc', 'tm', 'lease', 'maint']] == 0).sum(axis = 1) < 4]

#  csr   id  ac  otc    tm  lease   maint
# 1  1  543   a    0     1      1       0
# 3  2  123   w    1     1      1       1
# 6  4  258   k    1     1      1       1

Answer 2

try this:

In [35]: df.eval('otc == 0 and tm == 0 and lease == 0 and maint == 0')
Out[35]:
0     True
1    False
2     True
3    False
4     True
5     True
6    False
7     True
dtype: bool

In [36]: df[~df.eval('otc == 0 and tm == 0 and lease == 0 and maint == 0')]
Out[36]:
   csr   id ac  otc  tm  lease  maint
1    1  543  a    0   1      1      0
3    2  123  w    1   1      1      1
6    4  258  k    1   1      1      1

remove specific rows in dataframe with pandas

Question

2 answers

solution1
2 ACCPTED 2016-09-07 20:45:31

solution2
1 2016-09-07 20:44:36

remove specific rows in dataframe with pandas

Question

2 answers

solution1 2 ACCPTED 2016-09-07 20:45:31

solution2 1 2016-09-07 20:44:36

solution1
2 ACCPTED 2016-09-07 20:45:31

solution2
1 2016-09-07 20:44:36