Using Python's `in`-operator in Pandas dataframe .loc

Question

I'm working with some Pandas dataframes and I can't quite understand why some boolean expressions work inside the .loc selector while others raise an error. To be precise, let's take the following dataframe:

import pandas as pd
df = pd.DataFrame({'A': 'foo bar foo bar foo bar foo foo'.split(),
                   'B': 'one one two thr two two one thr'.split()})

Now both 'two' == 'two' and 'w' in 'two' evaluate to True, but when used with df.loc[...] only the first form works:

df.loc[df['B'] == 'two']

printing out

         A       B
    2   foo     two
    4   foo     two
    5   bar     two

But the following raises KeyError: False.

df.loc['w' in df['B']]

I know ways to work around this, but none of them feel particularly smooth, and even worse, I don't understand at all why the 'w' in df['B'] selector is not allowed in .loc.

Answer 1

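Have a look at the output of df['B'] == 'two' and compare it to the output of 'w' in df['B']. The first one outputs a pandas Series containing either True or False for each row in df['B']. The second one outputs a single False, because in applied to a Series tests whether 'w' is one of the index labels, not whether it occurs in any of the values.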
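To make the difference concrete, here is a minimal sketch using the dataframe from the question. The variable names elementwise and membership are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': 'foo bar foo bar foo bar foo foo'.split(),
                   'B': 'one one two thr two two one thr'.split()})

# == is applied element by element and yields one boolean per row.
elementwise = df['B'] == 'two'

# `in` calls Series.__contains__, which checks the index labels (0..7 here),
# so the result is a single Python bool.
membership = 'w' in df['B']

print(type(elementwise).__name__, len(elementwise))
print(membership)                 # 'w' is not an index label
print(2 in df['B'])               # 2 is an index label
print('two' in df['B'].values)    # membership test against the values instead
```

Only the first expression has the per-row shape that a row selection needs.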

The .loc indexer can take "A boolean array of the same length as the axis being sliced, eg [True, False, True]" (see the .loc documentation). You get KeyError: False because .loc receives the single value False and treats it as a row label to look up, and no row has that label.
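The two cases can be reproduced side by side. This is a small sketch on the question's dataframe; the names mask, selected and error are made up for the example, and the exact wording of the KeyError message may differ between pandas versions.

```python
import pandas as pd

df = pd.DataFrame({'A': 'foo bar foo bar foo bar foo foo'.split(),
                   'B': 'one one two thr two two one thr'.split()})

# A boolean list with one entry per row is accepted as a mask.
mask = [value == 'two' for value in df['B']]
selected = df.loc[mask]
print(selected)

# A single bool is not a mask; .loc treats it as a label and fails.
try:
    df.loc['w' in df['B']]
    error = None
except KeyError as exc:
    error = exc
print(repr(error))
```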

To use the 'w' in df['B'] expression you could apply it to each entry instead:

list_true_false = ['w' in entry for entry in df['B']]

df.loc[list_true_false]

Hope that helps!

Answer 2

You need the isin method or the str.contains method:

df.loc[df['B'].isin(['two'])] # match whole values; pass the candidates as a list
df.loc[df['B'].str.contains('w')] # match a substring or a regex pattern
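As a quick check, the sketch below compares the approaches from both answers on the question's dataframe. The result names are illustrative. Passing regex=False is an optional choice made here so that 'w' is treated as a plain substring; by default str.contains interprets the pattern as a regular expression.

```python
import pandas as pd

df = pd.DataFrame({'A': 'foo bar foo bar foo bar foo foo'.split(),
                   'B': 'one one two thr two two one thr'.split()})

# Per-entry substring test written out by hand.
by_comprehension = df.loc[['w' in entry for entry in df['B']]]

# Vectorized substring test; regex=False matches 'w' literally.
by_contains = df.loc[df['B'].str.contains('w', regex=False)]

# Whole-value membership test against a list of candidates.
by_isin = df.loc[df['B'].isin(['two'])]

print(by_contains)
print(by_comprehension.equals(by_contains))
print(by_contains.equals(by_isin))
```

On this data all three select the same rows, because 'two' is the only value containing 'w'. In general isin matches whole values only, while str.contains also matches partial strings.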

2 answers

solution1
4 ACCEPTED 2020-07-31 13:22:24

solution2
2 2020-07-31 10:34:43
