How to use str.startswith for multiple columns?

Question

I have a dataframe that looks like this: [image: my data]

I used this to filter for users whose IDs begin with b, c, e, f, or 5, and it ran successfully:

df[df.userA.str.startswith(('b','c','e','f','5'))]

I now want to do the same for columns userA and userB, and tried running this unsuccessfully:

df[[df.userA.str.startswith(('b','c','e','f','5'))] and [df.userB.str.startswith(('b','c','e','f','5'))]]

Any ideas?

Answer 1

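You cannot use and here. In Python, and does not operate element-wise: it returns the first operand that is falsy or, if there is no such operand in the chain, the last operand. In your expression both operands are non-empty lists, which are always truthy, so the result is simply the second list and the userA condition is discarded.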
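To make that behaviour concrete, here is a minimal sketch in plain Python. The lists mask_a and mask_b are hypothetical stand-ins for the boolean results of the two startswith calls; they are not taken from the question's data.

```python
# Hypothetical stand-ins for the two boolean results in the question.
mask_a = [False, True, False]   # pretend result for userA
mask_b = [True, True, False]    # pretend result for userB

# `and` does not combine element-wise. A non-empty list is truthy,
# so the expression evaluates to the right-hand operand unchanged.
result = [mask_a] and [mask_b]
is_right_operand = result == [mask_b]   # mask_a was silently discarded

# A falsy left operand (here an empty list) is returned as-is instead.
empty_result = [] and [mask_b]
```

In both cases and returns one of its operands untouched; it never produces a row-by-row combination of the two.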

You can, however, use the & and | operators as element-wise logical and and or, respectively, to combine multiple conditions.

So for your case, you probably want to use:

df[
    df.userA.str.startswith(('b','c','e','f','5')) &
    df.userB.str.startswith(('b','c','e','f','5'))
]

(this gives the rows of the dataframe df for which both userA and userB start with one of the characters in ('b','c','e','f','5')); or

df[
    df.userA.str.startswith(('b','c','e','f','5')) |
    df.userB.str.startswith(('b','c','e','f','5'))
]

(this gives the rows of the dataframe df for which at least one of userA and userB starts with one of the characters in ('b','c','e','f','5'))
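The two filters can be tried end to end with a small sketch like the one below. The sample values are invented for illustration, since the real dataframe is not shown, and the sketch assumes a pandas version in which str.startswith accepts a tuple of prefixes, as the code in the question does.

```python
import pandas as pd

# Hypothetical sample data; the dataframe from the question is not available.
df = pd.DataFrame({
    "userA": ["bob1", "carl", "zed9", "5abc", "zoe2"],
    "userB": ["fay3", "xena", "cat7", "5xyz", "quin"],
})

prefixes = ("b", "c", "e", "f", "5")

# One boolean Series per column, aligned on the dataframe's index.
starts_a = df["userA"].str.startswith(prefixes)
starts_b = df["userB"].str.startswith(prefixes)

# & keeps rows where both columns match; | keeps rows where either matches.
both = df[starts_a & starts_b]
either = df[starts_a | starts_b]
```

Storing each condition in its own variable also avoids operator-precedence surprises: & and | bind more tightly than comparisons such as ==, so conditions written inline with comparisons need to be wrapped in parentheses.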

For more information, see the section on Boolean indexing in the pandas documentation.


solution1
0 ACCEPTED 2018-09-22 15:28:47
