Select rows from dataframe based on substring A or B in a column

Question

Sorry, I needed to edit my question because I'm actually looking for substrings of more than one character. The suggested answers are good, but they mostly work only for one-character strings.

import pandas as pd

test = pd.DataFrame({'A': 'ju1 j4 abjul boy noc s1 asep'.split(),
                 'B': [1, 2, 3, 4, 5, 6, 7]})
print(test)


       A  B
0    ju1  1
1     j4  2
2  abjul  3
3    boy  4
4    noc  5
5     s1  6
6   asep  7

I know I can select all the rows that contain 'ju' with

subset = test[test['A'].str.contains('ju')]
print(subset)

       A  B
0    ju1  1
2  abjul  3

Is there an elegant way to select all rows that contain either 'ju' or 'as'?

This works, as suggested below. Are there other ways that also work?

# 'as' is a reserved word in Python, so the masks get descriptive names instead
has_ju = test.A.str.contains('ju')
has_as = test.A.str.contains('as')
subset = test[has_ju | has_as]

Answer 1

In [13]: test.loc[test.A.str.contains(r'[js]')]
Out[13]:
       A  B
0     j1  1
1     j4  2
2  abjul  3
5     s1  6
6   asep  7
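
The character class `[js]` matches any row containing a single `j` or `s`. For the multi-character substrings in the edited question, a regex alternation is one possible approach. The following is a minimal sketch, assuming the substrings are 'ju' and 'as'; the names `substrings` and `pattern` are illustrative, and `re.escape` is used only as a precaution in case a substring contains regex metacharacters.

```python
import re
import pandas as pd

test = pd.DataFrame({'A': 'ju1 j4 abjul boy noc s1 asep'.split(),
                     'B': [1, 2, 3, 4, 5, 6, 7]})

# Illustrative list of substrings to search for
substrings = ['ju', 'as']

# Build the alternation 'ju|as'; re.escape guards against regex metacharacters
pattern = '|'.join(re.escape(s) for s in substrings)

# str.contains treats the pattern as a regular expression by default
subset = test[test['A'].str.contains(pattern)]
print(subset)
```

With the question's data, this should keep the rows for 'ju1', 'abjul' and 'asep'.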

Answer 2

option 1
try using str.match

test[test.A.str.match('.*[js].*')]

option 2
set operations

s = test.A.apply(set)
test[s.sub(set(list('js'))).lt(s)]

option 3
set operations with numpy broadcasting

import numpy as np

s = test.A.apply(set)
test[(~(np.array([[set(['j'])], [set(['s'])]]) - s.values).astype(bool)).any(0)]

option 4
separate conditions

cond_j = test.A.str.contains('j')
cond_s = test.A.str.contains('s')
test[cond_j | cond_s]

All yield

       A  B
0     j1  1
1     j4  2
2  abjul  3
5     s1  6
6   asep  7
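
The separate-conditions idea in option 4 can be extended to any number of multi-character substrings. The sketch below is one possible way, assuming the substrings 'ju' and 'as' from the edited question; `substrings`, `masks` and `combined` are illustrative names. Passing `regex=False` makes `str.contains` treat each substring literally.

```python
from functools import reduce
import operator

import pandas as pd

test = pd.DataFrame({'A': 'ju1 j4 abjul boy noc s1 asep'.split(),
                     'B': [1, 2, 3, 4, 5, 6, 7]})

# Illustrative list of substrings to search for
substrings = ['ju', 'as']

# One boolean mask per substring; regex=False means a literal match
masks = [test['A'].str.contains(s, regex=False) for s in substrings]

# OR all masks together, equivalent to masks[0] | masks[1] | ...
combined = reduce(operator.or_, masks)

subset = test[combined]
print(subset)
```

This avoids building a regex at all, at the cost of one pass over the column per substring.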

time testing

2 answers

solution1
2 2017-01-18 19:31:27

solution2
1 ACCEPTED 2017-01-18 19:23:35
