
Python regex matching a multiple of an expression

I know this is probably pretty basic, but I'm trying to write a regex that only matches a whole multiple of a group of characters. For example, re.findall(expression, 'aaaa') should return ['aaaa'], but re.findall(expression, 'aaa') should return ['aa'], where expression is some regex built around the pair aa. It should return the entire string only if the entire string is an integer multiple of 'aa'. Any ideas?

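Just use (aa)+. (For findall, you'll want a non-capturing group, so (?:aa)+, because findall returns the contents of the group rather than the whole match when the pattern contains a capturing group.)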

>>> import re
>>> re.findall('(?:aa)+', 'aa')
['aa']
>>> re.findall('(?:aa)+', 'aaaa')
['aaaa']
>>> re.findall('(?:aa)+', 'aaaaa')
['aaaa']
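
To make the capturing vs. non-capturing difference concrete, here is a small sketch. The helper name is_multiple_of_pair is made up for illustration, and using re.fullmatch (Python 3.4+) for the "entire string" check is my assumption about what the question wants, not part of the answer above.

```python
import re

# With a capturing group, findall returns what the group captured last,
# not the whole match.
print(re.findall(r'(aa)+', 'aaaa'))    # ['aa']

# With a non-capturing group, findall returns the whole match.
print(re.findall(r'(?:aa)+', 'aaaa'))  # ['aaaa']


def is_multiple_of_pair(s):
    """Hypothetical helper: True if s is one or more repetitions of 'aa'."""
    return re.fullmatch(r'(?:aa)+', s) is not None


print(is_multiple_of_pair('aaaa'))  # True
print(is_multiple_of_pair('aaa'))   # False
```

findall is the right tool if you want the longest even-length runs inside a larger string; fullmatch is the one to reach for if you only care whether the whole string qualifies.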

Try something like (?:(?:expression){3})+ to find all runs that are a multiple of three repetitions of the expression. If the expression is short, you could also just write it out the required number of times.
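
A minimal sketch of that pattern, with 'ab' standing in for expression. The function name multiples_pattern is hypothetical, and it assumes expression is already a valid regex fragment (no escaping is applied).

```python
import re


def multiples_pattern(expression, n):
    """Hypothetical helper: compile a pattern matching runs in which
    `expression` repeats a multiple of n times.

    Assumes `expression` is already a valid regex fragment.
    """
    return re.compile('(?:(?:' + expression + '){' + str(n) + '})+')


pat = multiples_pattern('ab', 3)

# Seven repetitions of 'ab': only the first six (a multiple of three) match.
print(pat.findall('ab' * 7))  # ['abababababab']

# Two repetitions is fewer than three, so nothing matches.
print(pat.findall('ab' * 2))  # []
```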

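If you want to match exact duplications, i.e. the same matched text repeated rather than just the same pattern, try something like (?:(expression)\1{2})+ for multiples of three (write the backreference as \\1 if you are not using a raw Python string). Note that this may require backtracking if the expression is non-trivial and thus may be slow.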
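
Here is a rough comparison of the two variants, with [ab] standing in for expression. Since the backreference version needs a capturing group, findall would return the group's contents, so this sketch uses finditer and group(0) instead; the helper whole_matches is made up for illustration.

```python
import re

# Any three characters from [ab], repeated: the characters may differ.
loose = re.compile(r'(?:(?:[ab]){3})+')

# One character from [ab], then that same character twice more (\1{2}).
strict = re.compile(r'(?:([ab])\1{2})+')


def whole_matches(pattern, text):
    """Hypothetical helper: return full match texts, ignoring capture groups."""
    return [m.group(0) for m in pattern.finditer(text)]


print(whole_matches(loose, 'aababb'))   # ['aababb']
print(whole_matches(strict, 'aababb'))  # []
print(whole_matches(strict, 'aaabbb'))  # ['aaabbb']
```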


 