Regex - Python: Capture three (3) words after a specific word

Question

Hello everyone, I have the following code:

import re

str1 = "Hello, I would like to meet you at the train station of Berlin after 6 o' clock"
match = re.compile(r' at \w+ \w+ \w+')
match.findall(str1)

Is there a better way than "\w+ \w+ \w+", for example a way to capture a specific number of words?

Answer 1

Yes. To match a specific number of repetitions, use curly braces, e.g.:

match = re.compile(r'at ((\w+ ){3})')

Which gives:

>>> print(match.findall(str1))
[('the train station ', 'station ')]
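The result is a tuple because the pattern contains two capturing groups, and findall returns one entry per group. The inner group is repeated, so it keeps only its last iteration. A minimal sketch of that behaviour, reusing str1 from the question (the names m, outer and inner are only illustrative):

```python
import re

str1 = "Hello, I would like to meet you at the train station of Berlin after 6 o' clock"

# Two capturing groups: the outer one spans all three words,
# the inner (repeated) one retains only its final iteration.
m = re.search(r'at ((\w+ ){3})', str1)
outer, inner = m.group(1), m.group(2)
print(repr(outer))  # 'the train station '
print(repr(inner))  # 'station '
```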

In general, to capture just the n words after word, your regex would be (with n replaced by the actual count):

'word\s+((?:\w+(?:\s+|$)){n})'

Where ?: designates a "non-capturing" group, \s designates whitespace, | means "or", and $ means "end of string". Therefore:

>>> print(re.compile(r'at\s+((?:\w+(?:\s+|$)){3})').findall(str1))
['the train station ']
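The general pattern could be wrapped in a small helper along these lines. This is only a sketch: words_after is a hypothetical name, and the leading \b, the re.escape call and the final strip() are additions not present in the answer above. They are assumed to be desirable so that 'at' does not match inside a word such as 'that', so that the search word is treated literally, and so that trailing whitespace is dropped.

```python
import re

def words_after(text, word, n):
    """Return the n words following each standalone occurrence of `word`."""
    # \b (an assumption, not in the original pattern) keeps 'at' from
    # matching inside words such as 'that'; re.escape guards against
    # regex metacharacters in `word`.
    pattern = r'\b%s\s+((?:\w+(?:\s+|$)){%d})' % (re.escape(word), n)
    # strip() removes the trailing whitespace the pattern captures.
    return [m.strip() for m in re.findall(pattern, text)]

str1 = "Hello, I would like to meet you at the train station of Berlin after 6 o' clock"
print(words_after(str1, 'at', 3))  # ['the train station']
```

Without the \b, a string like "that the big red dog" would produce a match starting inside "that", which is probably not what is wanted.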