
Python regex: How can I have one pattern that searches for multiple pattern strings?

I have to search for the following patterns in a file (any match qualifies):

pattern_strings = ['\xc2d', '\xa0', '\xe7', '\xc3\ufffdd', '\xc2\xa0', '\xc3\xa7', '\xa0\xa0', '\xc2', '\xe9']
pattern = [re.compile(x) for x in pattern_strings]

and a function that uses them:

def find_pattern(path):
    with open(path, 'r') as f:
        for line in f:
            found = pattern.search(line)
            if found:
                logging.info('found - ' + found)

When I try using it:

find_pattern('myfile')

I see AttributeError: "'list' object has no attribute 'search'"

because pattern is a list of compiled patterns rather than a single one:

[<_sre.SRE_Pattern object at 0x107948378>, <_sre.SRE_Pattern object at 0x107b31c70>, <_sre.SRE_Pattern object at 0x107b31ce0>, <_sre.SRE_Pattern object at 0x107ac3cb0>, <_sre.SRE_Pattern object at 0x107b747b0>, <_sre.SRE_Pattern object at 0x107b74828>, <_sre.SRE_Pattern object at 0x107b748a0>, <_sre.SRE_Pattern object at 0x107b31d50>, <_sre.SRE_Pattern object at 0x107b31dc0>]

How can I have one pattern that looks for all the strings in pattern_strings?

You could simply join all the expressions together with a | (regex alternation):

pattern_strings = ['\xc2d', '\xa0', '\xe7', '\xc3\ufffdd', '\xc2\xa0', '\xc3\xa7', '\xa0\xa0', '\xc2', '\xe9']
pattern_string = '|'.join(pattern_strings)
pattern = re.compile(pattern_string)

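This does, however, assume that none of your patterns is complicated enough for a simple join like this to break it, for example by containing unescaped regex metacharacters. For the ones in your example it should work. For more complex patterns it may not. Note also that alternation tries the alternatives from left to right, so a shorter string listed before a longer one that starts with it (such as '\xc2' before '\xc2\xa0') will win if it comes first.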
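If the strings are meant to be matched literally, one defensive variant is to escape each of them and put the longer ones first before joining. The following is only a sketch under those assumptions, written for Python 3; the helper names build_pattern and find_matches are hypothetical and not part of the question, and it takes an iterable of lines rather than a file path so it can be tried without a file.

```python
import re

# Literal strings to look for (copied from the question).
pattern_strings = ['\xc2d', '\xa0', '\xe7', '\xc3\ufffdd', '\xc2\xa0',
                   '\xc3\xa7', '\xa0\xa0', '\xc2', '\xe9']


def build_pattern(strings):
    """Compile one alternation that matches any of the given literals.

    Hypothetical helper: longer strings go first so that, for example,
    '\xc2\xa0' is tried before '\xc2', and re.escape() neutralises any
    regex metacharacters the literals might contain.
    """
    ordered = sorted(strings, key=len, reverse=True)
    return re.compile('|'.join(re.escape(s) for s in ordered))


pattern = build_pattern(pattern_strings)


def find_matches(lines):
    """Yield (line_number, matched_text) for the first match on each line."""
    for number, line in enumerate(lines, start=1):
        found = pattern.search(line)
        if found:
            # found is a match object; .group() returns the matched text.
            yield number, found.group()
```

To use it on a file, something like `with open(path, encoding='utf-8') as f: for number, text in find_matches(f): ...` should work, assuming the file decodes with that encoding. Because search() returns a match object rather than a string, log `found.group()` instead of concatenating `found` itself onto a string.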
