
Isolating pandas columns using boolean logic in Python

I am trying to grab the rows of a data frame that satisfy one or both of the following boolean statements:

1) df['colName'] == 0
2) df['colName'] == 1

I've tried these, but neither one works (throws errors):

df = df[df['colName']==0 or df['colName']==1]
df = df[df['colName']==0 | df['colName']==1]

Any other ideas?

You are missing parentheses. The | operator binds more tightly than ==, so each comparison needs its own ():

df = df[(df['colName']==0) | (df['colName']==1)]

This will probably raise a copy warning (SettingWithCopyWarning) if you later modify the result, but it still works.

If you don't want the copy warning, use an indexer such as:

indexer = df[(df['colName']==0) | (df['colName']==1)].index
df = df.loc[indexer,:]
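
As a minimal sketch of the points above, the snippet below uses a made-up toy frame (only the column name comes from the question) to show why `or` fails and how the parenthesized `|` mask behaves. The trailing `.copy()` is not part of the answer above; it is one common way to get an independent frame so that later assignments should not trigger the copy warning.

```python
import pandas as pd

# Hypothetical toy data; only the column name 'colName' comes from the question.
df = pd.DataFrame({"colName": [0, 1, 2, 1, 0, 3]})

# `or` asks Python for a single True/False value of a whole Series,
# which pandas refuses with a ValueError.
try:
    df[(df["colName"] == 0) or (df["colName"] == 1)]
    or_error = None
except ValueError as exc:
    or_error = exc

# Element-wise OR, with each comparison wrapped in its own parentheses.
mask = (df["colName"] == 0) | (df["colName"] == 1)

# Assumption: .copy() is added here for illustration, to make the result
# an independent DataFrame before any later modification.
filtered = df[mask].copy()

print(type(or_error).__name__)
print(filtered)
```

Building the mask as a named variable first also makes it easy to inspect or reuse before filtering.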

You could clean up what you've done using eq instead of ==

df[df.colName.eq(0) | df.colName.eq(1)]

For this case, I recommend using isin

df[df.colName.isin([0, 1])]

Using query also works but is slower

df.query('colName in [0, 1]')
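
To check that the three spellings above select the same rows, here is a small sketch on a hypothetical toy frame (the values are made up for illustration; only `colName` comes from the question):

```python
import pandas as pd

# Hypothetical toy data for illustration.
df = pd.DataFrame({"colName": [0, 1, 2, 1, 0, 3]})

by_eq = df[df["colName"].eq(0) | df["colName"].eq(1)]
by_isin = df[df["colName"].isin([0, 1])]
by_query = df.query("colName in [0, 1]")

# DataFrame.equals compares values, index, and dtypes.
same = by_eq.equals(by_isin) and by_isin.equals(by_query)
print(same)
```

`isin` tends to scale best in readability when the list of allowed values grows, since the `eq` form needs one extra `|` term per value.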

Timing
isin is the quickest of the three on the df defined below:
df = pd.DataFrame(np.random.randint(3, size=10000), columns=['colName'])
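
To rerun a comparison like this on your own machine, a rough sketch using the standard-library `timeit` module is shown below. The repeat counts and the `candidates` dictionary are arbitrary choices for illustration, and absolute numbers will vary with hardware and pandas version, so treat the output as indicative only.

```python
import timeit

import numpy as np
import pandas as pd

# Same shape of data as in the answer: 10000 random integers in {0, 1, 2}.
df = pd.DataFrame(np.random.randint(3, size=10000), columns=["colName"])

# Hypothetical helper mapping: a label for each approach from the answer.
candidates = {
    "eq": lambda: df[df.colName.eq(0) | df.colName.eq(1)],
    "isin": lambda: df[df.colName.isin([0, 1])],
    "query": lambda: df.query("colName in [0, 1]"),
}

# Best of 3 repeats, 20 calls each, reported as seconds per call.
calls = 20
timings = {
    name: min(timeit.repeat(fn, number=calls, repeat=3)) / calls
    for name, fn in candidates.items()
}

# Print fastest first.
for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{name:>5}: {seconds * 1e6:.1f} microseconds per call")
```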

