Select rows from a DataFrame based on substring A or B in a column

Question

Sorry, I needed to edit my question, as I'm actually looking for substrings with more than one character. The suggested answers are good, but they mostly work for single-character strings.

import pandas as pd

test = pd.DataFrame({'A': 'ju1 j4 abjul boy noc s1 asep'.split(),
                 'B': [1, 2, 3, 4, 5, 6, 7]})
print(test)


       A  B
0    ju1  1
1     j4  2
2  abjul  3
3    boy  4
4    noc  5
5     s1  6
6   asep  7

I know I can select all the rows that contain 'ju' with

subset = test[test['A'].str.contains('ju')]
print(subset)

       A  B
0    ju1  1
2  abjul  3

Is there an elegant way to select all rows that contain either 'ju' or 'as'?

The following works, as suggested in the answers below. Are there other ways that also work?

# `as` is a reserved word in Python, so the masks get descriptive names
has_ju = test.A.str.contains('ju')
has_as = test.A.str.contains('as')
subset = test[has_ju | has_as]

Answer 1

In [13]: test.loc[test.A.str.contains(r'[js]')]
Out[13]:
       A  B
0     j1  1
1     j4  2
2  abjul  3
5     s1  6
6   asep  7
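
The character class above matches single characters. For the multi-character substrings in the edited question, one possible approach is regex alternation with str.contains. The sketch below is not part of the original answer: the `substrings` list and the `pattern` variable are illustrative names, and re.escape is used on the assumption that the substrings should be matched literally.

```python
import re
import pandas as pd

test = pd.DataFrame({'A': 'ju1 j4 abjul boy noc s1 asep'.split(),
                     'B': [1, 2, 3, 4, 5, 6, 7]})

# Illustrative list of substrings to search for (an assumption, not from the answer)
substrings = ['ju', 'as']

# Escape each substring and join with '|' so the regex matches any one of them
pattern = '|'.join(re.escape(s) for s in substrings)

subset = test[test['A'].str.contains(pattern)]
print(subset)
```

With the question's data this should keep the rows for 'ju1', 'abjul' and 'asep'.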

Answer 2

option 1
try using str.match

test[test.A.str.match('.*[js].*')]

option 2
set operations

s = test.A.apply(set)
test[s.sub(set(list('js'))).lt(s)]
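
The idea behind the set approach can be spelled out in plain Python: removing the target characters from a row's character set leaves a strict subset only when the row contained at least one of them. The sketch below is an illustration of that logic, not the answer's exact code. The names `targets` and `mask` are hypothetical, and because it compares individual characters it only applies to one-character targets.

```python
import pandas as pd

test = pd.DataFrame({'A': 'ju1 j4 abjul boy noc s1 asep'.split(),
                     'B': [1, 2, 3, 4, 5, 6, 7]})

# Single characters to look for (hypothetical name)
targets = set('js')

# Turn each string into the set of its characters
s = test['A'].apply(set)

# (chars - targets) < chars is True only if removing the targets
# actually removed something, i.e. the row contained 'j' or 's'
mask = s.apply(lambda chars: (chars - targets) < chars)

print(test[mask])
```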

option 3
set operations with numpy broadcasting

import numpy as np

s = test.A.apply(set)
test[(~(np.array([[set(['j'])], [set(['s'])]]) - s.values).astype(bool)).any(0)]

option 4
separate conditions

cond_j = test.A.str.contains('j')
cond_s = test.A.str.contains('s')
test[cond_j | cond_s]

All yield

       A  B
0     j1  1
1     j4  2
2  abjul  3
5     s1  6
6   asep  7
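
The separate-conditions approach in option 4 extends to multi-character substrings and to any number of them. As a minimal sketch (the `substrings` list and helper names are assumptions, not from the answer), one mask is built per substring with a plain, non-regex search, and the masks are combined with an elementwise OR:

```python
import operator
from functools import reduce

import pandas as pd

test = pd.DataFrame({'A': 'ju1 j4 abjul boy noc s1 asep'.split(),
                     'B': [1, 2, 3, 4, 5, 6, 7]})

# Illustrative list of substrings (an assumption, not from the answer)
substrings = ['ju', 'as']

# One boolean mask per substring; regex=False does a literal substring search
masks = [test['A'].str.contains(sub, regex=False) for sub in substrings]

# Fold the masks together with elementwise OR (mask1 | mask2 | ...)
combined = reduce(operator.or_, masks)

subset = test[combined]
print(subset)
```

Using regex=False avoids having to escape characters that are special in regular expressions, at the cost of one pass over the column per substring.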

time testing
