
Why does the Pandas series.str.contains method not detect a match when there is a leading space?

I want to find all index values that contain the string ' (target)'.

Example:

import pandas as pd

index = pd.Index(['TIC7201-PV (target)', 'TIC7202-PV', 'TIC7203-PV'])
print(index.str.contains(' (target)'))

What I get:

[False False False]

What I expected:

[ True False False]

For comparison:

print(index.str.contains('(target)'))
print(index.str.endswith(' (target)'))

produces:

[ True False False]
[ True False False]

It turns out that the regex argument of str.contains defaults to True.

  • With regex=True, (...) is a capturing group rather than a pair of literal parentheses, so the pattern ' (target)' searches for ' target' (a space followed directly by target) instead of ' (target)'.
  • The options to resolve the issue are:
    • Set regex=False
    • Escape the parentheses: \(...\)
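The grouping behavior can be checked without pandas. The following is a minimal sketch using Python's standard re module, on the assumption that str.contains with regex=True applies the pattern with search semantics; the variable names are only illustrative.

```python
import re

label = 'TIC7201-PV (target)'

# Unescaped: the parentheses form a capture group, so the pattern
# requires a space immediately followed by "target".
unescaped = re.search(' (target)', label)

# Escaped: \( and \) match literal parenthesis characters.
escaped = re.search(r' \(target\)', label)

# Without the leading space, the group only needs the text "target",
# which is why the '(target)' pattern did find a match.
bare = re.search('(target)', label)

print(unescaped)        # None: the label has ' (' before "target", not ' t'
print(escaped.group())  # the full literal text, parentheses included
print(bare.group(1))    # the captured text, without parentheses
```

This also explains the comparison in the question: '(target)' matched only because the word target appears in the label, not because the parentheses were found.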

Therefore, to get the desired behavior, there are two options:

# 1: treat the pattern as a literal string
index.str.contains(' (target)', regex=False)

# 2: keep regex matching but escape the parentheses
index.str.contains(r' \(target\)')
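If the search text comes from a variable and regex matching is still wanted, one further possibility (not part of the answer above) is to let re.escape from the standard library add the backslashes. This is a sketch only; needle and mask are hypothetical names introduced for illustration.

```python
import re

import pandas as pd

index = pd.Index(['TIC7201-PV (target)', 'TIC7202-PV', 'TIC7203-PV'])

# Hypothetical variable holding the text to be matched literally.
needle = ' (target)'

# re.escape backslash-escapes regex metacharacters such as ( and ),
# so the resulting pattern matches the text character for character.
mask = index.str.contains(re.escape(needle))
print(mask)
```

For a plain substring test, regex=False is likely the simpler choice; escaping is mainly useful when the literal text has to be combined with other regex syntax.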

Pass regex=False; otherwise the () here is interpreted as regex syntax.

index.str.contains(' (target)',regex=False)
Out[103]: array([ True, False, False])
