
Python Pandas: get rows of a DataFrame where a column is not null

I'm filtering my DataFrame, dropping the rows in which the value of a specific column is None.

df = df[df['my_col'].isnull() == False]

Works fine, but PyCharm tells me:

PEP8: comparison to False should be 'if cond is False:' or 'if not cond:'

But I wonder how I should apply this to my use case. Using 'not ...' or '... is False' did not work. My current solution is:

df = df[df['my_col'].notnull()]

Python has the short-circuiting logical operators not, and, or. These have a very specific meaning in Python and cannot be overridden: not must return a bool, and a and b / a or b always evaluates to either a or b, or raises an error.
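To illustrate why the 'not ...' and 'is False' attempts from the question fail, here is a minimal sketch. The Series `s` and the variable names are made up for the example; only the behaviour of `not` and `is` on a pandas Series is the point.

```python
import pandas as pd

# Hypothetical sample data standing in for df['my_col'].
s = pd.Series([1.0, None, 3.0])
mask = s.isnull()

# `not` asks the whole Series for a single truth value,
# which pandas refuses to give for a multi-element Series.
try:
    not mask
    raised = False
except ValueError:
    raised = True

# `is` compares object identity, so this is a plain Python bool,
# not an element-wise mask.
identity_result = mask is False

print(raised, identity_result)
```

Neither expression produces an element-wise boolean mask, so neither can be used to index the DataFrame the way the original `== False` comparison does.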

However, Python also has overloadable operators that can act as boolean operators: ~ (not), & (and), | (or) and ^ (xor).

You may recognise these as the int bitwise operators, but NumPy (and therefore pandas) uses them for element-wise boolean operations on arrays and Series.

For example:

import numpy as np

b = np.array([True, False, True]) & np.array([True, False, False])
# b --> [True False False]
b = ~b
# b --> [False True True]

Hence what you want is

df = df[~df['my_col'].isnull()]

I agree with PEP 8: don't use == False.
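As a quick check, the sketch below compares the inverted isnull() mask with the notnull() version from the question, and with dropna(subset=...), which is another standard pandas way to express the same filter. The DataFrame contents and the 'other' column are invented for illustration.

```python
import pandas as pd

# Hypothetical data; only 'my_col' comes from the question.
df = pd.DataFrame({"my_col": [1.0, None, 3.0], "other": ["a", "b", "c"]})

via_invert = df[~df["my_col"].isnull()]     # inverted null mask
via_notnull = df[df["my_col"].notnull()]    # the question's current solution
via_dropna = df.dropna(subset=["my_col"])   # drop rows where 'my_col' is null

print(via_invert)
```

All three keep the same rows here, so the choice between them is mostly a matter of readability.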

