
Python equivalent of excel nested if condition for filtering Pandas DataFrame rows

I want to select specific rows of an Excel sheet using Python. In Excel I would do

If(And(Or(A<>({"Closed",""}),Or(B<>({"Closed",""})))

to get the rows of a data frame whose values are neither Closed nor blank. I tried using

df = df[(~df.A.isin([Closed","No Data"])) &(~df.B.isin([Closed","No Data"]))]

The problem is that Python also removes rows such as:

A                        B
Approved       Closed
No Data          Restrict
Restrict           No Data

which I don't want. As suggested in one of the links, I also tried

df.loc[(df[A] != "Closed") & (df[B] != "Closed") & (df[A] != "No data") & (df[B] != "No data")

I got the same result as when I tried .isin.

I will use this sample data:

           A         B  ~df.A.isin  ~df.B.isin  ~A & ~B  ~A | ~B
0     Closed    Closed       False       False    False    False
1     Closed   No Data       False       False    False    False
2   Approved    Closed        True       False    False     True
3    No Data   No Data       False       False    False    False
4     Closed  Approved       False        True    False     True
5    No Data  Restrict       False        True    False     True
6   Approved   No Data        True       False    False     True
7     Closed  Restrict       False        True    False     True
8   Approved  Approved        True        True     True     True
9    No Data  Approved       False        True    False     True
10  Restrict   No Data        True       False    False     True
11  Restrict  Approved        True        True     True     True

The ~df.A.isin column shows the value of ~df.A.isin(["Closed","No Data"]), which is True for rows where A contains neither Closed nor No Data.

The ~df.B.isin column shows the value of ~df.B.isin(["Closed","No Data"]), which is True for rows where B contains neither Closed nor No Data.

The ~A & ~B column shows the value of (~df.A.isin(["Closed","No Data"])) & (~df.B.isin(["Closed","No Data"])).

The ~A | ~B column shows the value of (~df.A.isin(["Closed","No Data"])) | (~df.B.isin(["Closed","No Data"])).
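The table above can be rebuilt with a short script. This is only a minimal sketch: the helper names bad, not_a and not_b are chosen for illustration and do not appear in the question.

```python
import pandas as pd

# Sample data copied from the table above.
df = pd.DataFrame({
    "A": ["Closed", "Closed", "Approved", "No Data", "Closed", "No Data",
          "Approved", "Closed", "Approved", "No Data", "Restrict", "Restrict"],
    "B": ["Closed", "No Data", "Closed", "No Data", "Approved", "Restrict",
          "No Data", "Restrict", "Approved", "Approved", "No Data", "Approved"],
})

bad = ["Closed", "No Data"]      # unwanted values (illustrative name)
not_a = ~df["A"].isin(bad)       # True where A holds neither unwanted value
not_b = ~df["B"].isin(bad)       # True where B holds neither unwanted value

# Store the masks as extra columns so they can be inspected side by side.
df["~df.A.isin"] = not_a
df["~df.B.isin"] = not_b
df["~A & ~B"] = not_a & not_b
df["~A | ~B"] = not_a | not_b
print(df)
```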

Your first attempt lacks a " at the beginning of Closed" in both lists. Adding it, we have

df[(~df.A.isin(["Closed","No Data"])) &(~df.B.isin(["Closed","No Data"]))]

which gives us:

           A         B  ~df.A.isin  ~df.B.isin  ~A & ~B  ~A | ~B
8   Approved  Approved        True        True     True     True
11  Restrict  Approved        True        True     True     True

The result shows only those rows where neither A nor B contains Closed or No Data.

The suggestion in comments by Wen-Ben:

df[(~df.A.isin(["Closed","No Data"])) |(~df.B.isin(["Closed","No Data"]))]

gives us:

           A         B  ~df.A.isin  ~df.B.isin  ~A & ~B  ~A | ~B
2   Approved    Closed        True       False    False     True
4     Closed  Approved       False        True    False     True
5    No Data  Restrict       False        True    False     True
6   Approved   No Data        True       False    False     True
7     Closed  Restrict       False        True    False     True
8   Approved  Approved        True        True     True     True
9    No Data  Approved       False        True    False     True
10  Restrict   No Data        True       False    False     True
11  Restrict  Approved        True        True     True     True

Here we have | (or) instead of & (and), so a row can contain Closed or No Data in A or in B, but not in both. This means the rows that have:

       A         B
Approved    Closed
 No Data  Restrict
Restrict   No Data

will be included, but not rows that have:

     A         B
Closed    Closed
Closed   No Data
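Another way to read the | version is through De Morgan's law: "A is acceptable or B is acceptable" is the same as "not (A is unwanted and B is unwanted)", so only rows where both columns hold an unwanted value are dropped. A small sketch on the five rows listed above (the variable names are my own):

```python
import pandas as pd

# The three rows that should be kept, followed by the two that should not.
df = pd.DataFrame({
    "A": ["Approved", "No Data", "Restrict", "Closed", "Closed"],
    "B": ["Closed", "Restrict", "No Data", "Closed", "No Data"],
})
bad = ["Closed", "No Data"]

# "A is fine OR B is fine" ...
either_ok = ~df["A"].isin(bad) | ~df["B"].isin(bad)
# ... selects the same rows as "NOT (A is bad AND B is bad)".
not_both_bad = ~(df["A"].isin(bad) & df["B"].isin(bad))

assert either_ok.equals(not_both_bad)
kept = df[either_ok]
print(kept)
```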

Your second attempt:

df.loc[(df[A] != "Closed") & (df[B] != "Closed") &
       (df[A] != "No data") & (df[B] != "No data")

needs quotes around the column labels and is missing the closing ]. You can use either df.A or df['A'], but not df[A].

Also, you spelled data in No data with a lowercase d, while in other places you have it with an uppercase D: No Data. String comparison in Python is case-sensitive, so these are not the same value. If we fix that:

df.loc[(df['A'] != "Closed") & (df['B'] != "Closed") &
       (df['A'] != "No Data") & (df['B'] != "No Data")]

we get the same result as with the first attempt:

           A         B  ~df.A.isin  ~df.B.isin  ~A & ~B  ~A | ~B
8   Approved  Approved        True        True     True     True
11  Restrict  Approved        True        True     True     True

If you rearrange this expression a little, grouping the conditions for each column in parentheses and joining the two groups with | (or):

df.loc[((df['A'] != "Closed") & (df['A'] != "No Data")) | 
       ((df['B'] != "Closed") & (df['B'] != "No Data"))]

we get:

           A         B  ~df.A.isin  ~df.B.isin  ~A & ~B  ~A | ~B
2   Approved    Closed        True       False    False     True
4     Closed  Approved       False        True    False     True
5    No Data  Restrict       False        True    False     True
6   Approved   No Data        True       False    False     True
7     Closed  Restrict       False        True    False     True
8   Approved  Approved        True        True     True     True
9    No Data  Approved       False        True    False     True
10  Restrict   No Data        True       False    False     True
11  Restrict  Approved        True        True     True     True
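The chained != comparisons and the isin form select the same rows. If more than two columns are involved, one possible generalisation is to call DataFrame.isin on all of them and reduce with all(axis=1). The sketch below checks this on a small made-up frame; the names chained, with_isin and all_cols are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["Closed", "Approved", "No Data", "Restrict", "Approved"],
    "B": ["No Data", "Closed", "Restrict", "No Data", "Approved"],
})
bad = ["Closed", "No Data"]

# Per-column groups of != comparisons joined with | (or).
chained = (((df["A"] != "Closed") & (df["A"] != "No Data")) |
           ((df["B"] != "Closed") & (df["B"] != "No Data")))

# The same condition written with isin.
with_isin = ~df["A"].isin(bad) | ~df["B"].isin(bad)

# Drop a row only if every listed column holds an unwanted value.
all_cols = ~df[["A", "B"]].isin(bad).all(axis=1)

assert chained.equals(with_isin)
assert with_isin.equals(all_cols)
result = df[all_cols]
print(result)
```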
