
How to use str.startswith for multiple columns?

I have a dataframe that looks like this: my data

I used this to filter for users whose ids begin with b, c, e, f, or 5, and it ran successfully:

df[df.userA.str.startswith(('b','c','e','f','5'))]

I now want to do the same for columns userA and userB, and tried running this unsuccessfully:

df[[df.userA.str.startswith(('b','c','e','f','5'))] and [df.userB.str.startswith(('b','c','e','f','5'))]]

Any ideas?

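You cannot use and here: in Python, and returns the first operand that is falsy or, if there is no such operand in the chain, the last operand. In your attempt both operands are non-empty lists, which are always truthy, so the expression simply evaluates to the second list and the userA condition is ignored.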
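As a rough illustration of this behaviour, the sketch below rebuilds the two masks on a small made-up frame. The sample values and the names mask_a, mask_b, result and raised are hypothetical; only the column names and the prefixes come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names follow the question.
df = pd.DataFrame({
    "userA": ["bob", "xan", "5al"],
    "userB": ["cat", "eve", "zed"],
})
prefixes = ("b", "c", "e", "f", "5")

mask_a = df.userA.str.startswith(prefixes)
mask_b = df.userB.str.startswith(prefixes)

# Both operands are non-empty lists, hence truthy, so `and` returns the
# second list unchanged and mask_a never takes part in the result.
result = [mask_a] and [mask_b]
print(result[0] is mask_b)

# Without the list wrappers, `and` needs the truth value of a whole Series,
# which pandas refuses to provide and reports as a ValueError.
try:
    mask_a and mask_b
    raised = False
except ValueError:
    raised = True
print(raised)
```

Either way, and never compares the two masks row by row, which is what the filter needs.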

You can, however, use the & and | operators as element-wise logical and and or, respectively, to combine multiple conditions.

So for your case, you probably want to use:

df[
    df.userA.str.startswith(('b','c','e','f','5')) &
    df.userB.str.startswith(('b','c','e','f','5'))
]

(this gives the rows of the dataframe df for which both userA and userB start with one of the characters in ('b','c','e','f','5')); or

df[
    df.userA.str.startswith(('b','c','e','f','5')) |
    df.userB.str.startswith(('b','c','e','f','5'))
]

(this gives the rows of the dataframe df for which at least one of userA and userB starts with one of the characters in ('b','c','e','f','5'))
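To make the difference between the two filters concrete, here is a minimal sketch on a made-up four-row frame. The sample values and the names both and either are hypothetical; only the column names userA and userB and the prefixes come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names follow the question.
df = pd.DataFrame({
    "userA": ["bob", "xan", "5al", "zoe"],
    "userB": ["cat", "eve", "zed", "yan"],
})
prefixes = ("b", "c", "e", "f", "5")

mask_a = df.userA.str.startswith(prefixes)  # True for "bob" and "5al"
mask_b = df.userB.str.startswith(prefixes)  # True for "cat" and "eve"

# Element-wise AND: keep rows where both columns match.
both = df[mask_a & mask_b]

# Element-wise OR: keep rows where at least one column matches.
either = df[mask_a | mask_b]

print(both)
print(either)
```

Naming the masks first is only a readability choice. If you write comparisons inline instead (for example df.userA == 'bob'), wrap each one in parentheses, because & and | bind more tightly than comparison operators.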

For more information, see the section on Boolean indexing in the pandas documentation.
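If more than two columns are involved, chaining & or | by hand gets verbose. One possible generalization, sketched here under the assumption that every listed column holds strings, is to build one boolean column per input column and reduce across them with all or any. The columns list, the sample values and the result names are hypothetical.

```python
import pandas as pd

# Hypothetical sample data; only the column names follow the question.
df = pd.DataFrame({
    "userA": ["bob", "xan", "5al", "zoe"],
    "userB": ["cat", "eve", "zed", "yan"],
})
prefixes = ("b", "c", "e", "f", "5")
columns = ["userA", "userB"]  # extend this list with further string columns

# One boolean column per input column; na=False counts missing values
# as non-matches instead of propagating NaN into the mask.
masks = df[columns].apply(lambda col: col.str.startswith(prefixes, na=False))

all_match = df[masks.all(axis=1)]  # every listed column matches
any_match = df[masks.any(axis=1)]  # at least one listed column matches

print(all_match)
print(any_match)
```

For two columns this should select the same rows as the explicit & and | versions.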
