How to match exact “multiple” strings in Python
I've got a list of exact patterns that I want to search for in a given string. Currently I've got a really bad solution for this problem:
import re
pat1 = re.compile('foo.trailingString')
mat1 = pat1.match(mystring)
pat2 = re.compile('bar.trailingString')
mat2 = pat2.match(mystring)
if mat1 or mat2:
    pass  # Do whatever
pat = re.compile('[foo|bar].trailingString')
match = pat.match(mystring)  # Doesn't work
The only condition is that I've got a list of strings which are to be matched exactly. What's the best possible solution in Python?
EDIT: The search patterns have some trailing patterns in common.
You could do a trivial regex that combines those two:
pat = re.compile('foo|bar')
if pat.match(mystring):
    pass  # Do whatever
You could then expand the regex to do whatever you need to, using the | separator (which means "or" in regex syntax).
Edit: Based upon your recent edit, this should do it for you:
pat = re.compile(r'(foo|bar)\.trailingString')  # raw string, so \. reaches the regex engine as a literal dot
if pat.match(mystring):
    pass  # Do whatever
The [] is a character class, so your [foo|bar] matches exactly one character out of f, o, |, b, a, r (a single character, since there's no *, + or ? after the class). () is the enclosure for a sub-pattern.
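To make the difference concrete, here is a small sketch. The sample strings are made up for illustration and are not from the question.

```python
import re

# Character class: matches exactly ONE character out of f, o, |, b, a, r.
char_class = re.compile(r'[foo|bar]\.trailingString')
# Group with alternation: matches the whole word "foo" or the whole word "bar".
group = re.compile(r'(foo|bar)\.trailingString')

samples = ['foo.trailingString', 'bar.trailingString', 'f.trailingString']
for s in samples:
    # Print whether each pattern matches at the start of the sample.
    print(s, bool(char_class.match(s)), bool(group.match(s)))
```

Only the group version accepts the two full words; the character class accepts a lone "f" (or even a lone "|") in front of the suffix.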
You're right in using | but you're using a character class [] instead of a subpattern (). Try this regex:
r = re.compile(r'(?:foo|bar)\.trailingString')  # (?:...) groups without capturing
if r.match(mystring):
    pass  # Do stuff
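Since the question starts from a list of exact strings, the same regex could also be built from that list instead of being typed by hand. This is only a sketch: build_pattern is a hypothetical helper name, and it assumes every word and the suffix should be taken literally.

```python
import re

def build_pattern(words, suffix):
    """Compile a regex matching any of `words` followed by the literal `suffix`.

    Hypothetical helper for illustration, not part of the re module.
    """
    # re.escape makes each string literal, so "." or "+" is not read as regex syntax.
    alternation = '|'.join(re.escape(w) for w in words)
    return re.compile('(?:' + alternation + ')' + re.escape(suffix))

pat = build_pattern(['foo', 'bar'], '.trailingString')
print(bool(pat.match('foo.trailingString')))  # True
print(bool(pat.match('fooXtrailingString')))  # False: the dot is literal here
```

Escaping matters as soon as one of the strings contains a regex metacharacter; without it, the "." in the suffix would match any character.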
Old answer
If you want to do exact substring matches, you shouldn't use regex.
Try using in instead:
words = ['foo', 'bar']
# mystring contains at least one of the words
if any(i in mystring for i in words):
    pass  # Do stuff
Do you want to search for patterns or strings? The best solution for each is very different:
# strings
patterns = ['foo', 'bar', 'baz']
matches = set(patterns)
if mystring in matches:  # O(1) - very fast
    pass  # do whatever
# patterns
import re
patterns = ['foo', 'bar']
matches = [re.compile(pat) for pat in patterns]
if any(m.match(mystring) for m in matches):  # O(n)
    pass  # do whatever
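Note that the two variants above do not test the same thing: set membership compares the whole string, while re.match only anchors at the start. A quick sketch of the difference, using made-up inputs (re.fullmatch needs Python 3.4 or later):

```python
import re

pat = re.compile(r'foo|bar')

print(bool(pat.match('foobar')))      # True: "foo" matches at the start
print(bool(pat.match('xfoo')))        # False: match() only tries position 0
print(bool(pat.search('xfoo')))       # True: search() scans the whole string
print(bool(pat.fullmatch('foobar')))  # False: the entire string must match
print(bool(pat.fullmatch('foo')))     # True
print('foobar' in {'foo', 'bar'})     # False: set membership is whole-string equality
```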
Edit: OK, you want to search for variable-length exact strings at the beginning of a search string; try
from collections import defaultdict
# Group the patterns by length so each prefix length is sliced and checked only once
matches = defaultdict(set)
patterns = ['foo', 'barr', 'bazzz']
for p in patterns:
    matches[len(p)].add(p)
for strlen, pats in matches.items():  # use .iteritems() on Python 2
    if mystring[:strlen] in pats:
        # do whatever
        break
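For a short list of prefixes, a simpler alternative (not what the code above does) may be str.startswith, which accepts a tuple of candidates. A minimal sketch; starts_with_any is a hypothetical helper name and the test strings are invented:

```python
# startswith accepts a str or a tuple of str, not a list.
patterns = ('foo', 'barr', 'bazzz')

def starts_with_any(text, prefixes=patterns):
    """Return True if `text` begins with any of the exact prefixes."""
    return text.startswith(prefixes)

print(starts_with_any('barr.trailingString'))  # True
print(starts_with_any('bar.trailingString'))   # False: "bar" alone is not in the tuple
```

This checks the prefixes one by one, so the length-bucketed lookup above is likely to scale better when the list is large.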
Perhaps:
any([re.match(r, mystring) for r in ['bar', 'foo']])
I'm assuming your match patterns will be more complex than foo or bar; if they aren't, just use
if mystring in ['bar', 'foo']: